Preface


The outlook of this second edition is the same as that of the original: to present linear
algebra as the theory and practice of linear spaces and linear mappings. Where it aids
understanding and calculations, I don't hesitate to describe vectors as arrays of
numbers and to describe mappings as matrices. Render onto Caesar the things which
are Caesar's.

If you can reduce a mathematical problem to a problem in linear algebra, you can
most likely solve it, provided that you know enough linear algebra. Therefore, a
thorough grounding in linear algebra is highly desirable. A sound undergraduate
education should offer a second course on the subject, at the senior level. I wrote this
book as a suitable text for such a course. The changes made in this second edition are
partly to make it more suitable as a text. Terse descriptions, especially in the early
chapters, were expanded, more problems were added, and a list of solutions to
selected problems has been provided.

In addition, quite a bit of new material has been added, such as the compactness
of the unit ball as a criterion of finite dimensionality of a normed linear space. A new
chapter discusses the QR algorithm for finding the eigenvalues of a self-adjoint
matrix. The Householder algorithm for turning such matrices into tridiagonal form is
presented. I describe in some detail the beautiful observation of Deift, Nanda, and
Tomei of the analogy between the convergence of the QR algorithm and Moser's
theorem on the asymptotic behavior of the Toda flow as time tends to infinity.

Eight new appendices have been added to the first edition's original eight,
including the Fast Fourier Transform, the spectral radius theorem, proved with the
help of the Schur factorization of matrices, and an excursion into the theory of
matrix-valued analytic functions. Appendix 11 describes the Lorentz group, 12 is an
interesting application of the compactness criterion for finite dimensionality, 13 is a
characterization of commutators, 14 presents a proof of Liapunov's stability
criterion, 15 presents the construction of the Jordan canonical form of matrices, and
16 describes Carl Pearcy's elegant proof of Halmos' conjecture about the numerical
range of matrices.

I conclude with a plea to include the simplest aspects of linear algebra in
high-school teaching: vectors with two and three components, the scalar product, the
cross product, the description of rotations by matrices, and applications to geometry.
Such modernization of the high-school curriculum is long overdue.

I acknowledge with pleasure much help I have received from Ray Michalek, as
well as useful conversations with Albert Novikoff and Charlie Peskin. I also would
like to thank Roger Horn, Beresford Parlett, and Jerry Kazdan for very useful
comments, and Jeffrey Ryan for help in proofreading.


Peter D. Lax



Preface to the First Edition 


This book is based on a lecture course designed for entering graduate students and
given over a number of years at the Courant Institute of New York University. The
course is open also to qualified undergraduates and on occasion was attended by
talented high school students, among them Alan Edelman; I am proud to have been
the first to teach him linear algebra. But, apart from special cases, the book, like the
course, is for an audience that has some — not much — familiarity with linear algebra. 

Fifty years ago, linear algebra was on its way out as a subject for research. Yet
during the past five decades there has been an unprecedented outburst of new ideas
about how to solve linear equations, carry out least square procedures, tackle
systems of linear inequalities, and find eigenvalues of matrices. This outburst came
in response to the opportunity created by the availability of ever faster computers
with ever larger memories. Thus, linear algebra was thrust center stage in numerical
mathematics. This had a profound effect, partly good, partly bad, on how the subject
is taught today.

The presentation of new numerical methods brought fresh and exciting material,
as well as realistic new applications, to the classroom. Many students, after all, are in
a linear algebra class only for the applications. On the other hand, bringing
applications and algorithms to the foreground has obscured the structure of linear
algebra, a trend I deplore; it does students a great disservice to exclude them from
the paradise created by Emmy Noether and Emil Artin. One of the aims of this book
is to redress this imbalance.

My second aim in writing this book is to present a rich selection of analytical
results and some of their applications: matrix inequalities, estimates for eigenvalues
and determinants, and so on. This beautiful aspect of linear algebra, so useful for
working analysts and physicists, is often neglected in texts.

I strove to choose proofs that are revealing, elegant, and short. When there are
two different ways of viewing a problem, I like to present both.

The Contents describes what is in the book. Here I would like to explain my
choice of materials and their treatment. The first four chapters describe the abstract
theory of linear spaces and linear transformations. In the proofs I avoid elimination
of the unknowns one by one, but use the linear structure; I particularly exploit
quotient spaces as a counting device. This dry material is enlivened by some
nontrivial applications to quadrature, to interpolation by polynomials, and to solving
the Dirichlet problem for the discretized Laplace equation.

In Chapter 5, determinants are motivated geometrically as signed volumes of
ordered simplices. The basic algebraic properties of determinants follow immediately.

Chapter 6 is devoted to the spectral theory of arbitrary square matrices with
complex entries. The completeness of eigenvectors and generalized eigenvectors is
proved without the characteristic equation, relying only on the divisibility theory of
the algebra of polynomials. In the same spirit we show that two matrices $A$ and $B$ are
similar if and only if $(A - kI)^m$ and $(B - kI)^m$ have nullspaces of the same
dimension for all complex $k$ and all positive integers $m$. The proof of this proposition
leads to the Jordan canonical form.

Euclidean structure appears for the first time in Chapter 7. It is used in Chapter 8
to derive the spectral theory of self-adjoint matrices. We present two proofs, one
based on the spectral theory of general matrices, the other using the variational
characterization of eigenvectors and eigenvalues. Fischer's min-max theorem is
explained.

Chapter 9 deals with the calculus of vector- and matrix-valued functions of a
single variable, an important topic not usually discussed in the undergraduate
curriculum. The most important result is the continuous and differentiable character
of eigenvalues and normalized eigenvectors of differentiable matrix functions,
provided that appropriate nondegeneracy conditions are satisfied. The fascinating
phenomenon of "avoided crossings" is briefly described and explained.

The first nine chapters, or certainly the first eight, constitute the core of linear algebra.
The next eight chapters deal with special topics, to be taken up depending on the interest
of the instructor and of the students. We shall comment on them very briefly.

Chapter 10 is a symphony of inequalities about matrices, their eigenvalues, and
their determinants. Many of the proofs make use of calculus.

I included Chapter 11 to make up for the unfortunate disappearance of mechanics
from the curriculum and to show how matrices give an elegant description of motion
in space. Angular velocity of a rigid body and divergence and curl of a vector field all
appear naturally. The monotonic dependence of eigenvalues of symmetric matrices
is used to show that the natural frequencies of a vibrating system increase if the
system is stiffened and the masses are decreased.

Chapters 12, 13, and 14 are linked together by the notion of convexity. In Chapter
12 we present the descriptions of convex sets in terms of gauge functions and support
functions. The workhorse of the subject, the hyperplane separation theorem, is
proved by means of the Hahn-Banach procedure. Carathéodory's theorem on
extreme points is proved and used to derive the König-Birkhoff theorem on doubly
stochastic matrices; Helly's theorem on the intersection of convex sets is stated and
proved.

Chapter 13 is on linear inequalities; the Farkas-Minkowski theorem is derived
and used to prove the duality theorem, which then is applied in the usual fashion to a
maximum-minimum problem in economics, and to the min-max theorem of von
Neumann about two-person zero-sum games.

Chapter 14 is on normed linear spaces; it is mostly standard fare except for a dual
characterization of the distance of a point from a linear subspace. Linear mappings
of normed linear spaces are discussed in Chapter 15.

Chapter 16 presents Perron's beautiful theorem on matrices all of whose entries
are positive. The standard application to the asymptotics of Markov chains is
described. In conclusion, the theorem of Frobenius about the eigenvalues of matrices
with nonnegative entries is stated and proved.

The last chapter discusses various strategies for solving iteratively systems of
linear equations of the form $Ax = b$, $A$ a self-adjoint, positive matrix. A variational
formula is derived and a steepest descent method is analyzed. We go on to present
several versions of iterations employing Chebyshev polynomials. Finally we
describe the conjugate gradient method in terms of orthogonal polynomials.

It is with genuine regret that I omit a chapter on the numerical calculation of 
eigenvalues of self-adjoint matrices. Astonishing connections have been discovered 
recently between this important subject and other seemingly unrelated topics. 

Eight appendices describe material that does not quite fit into the flow of the text,
but that is so striking or so important that it is worth bringing to the attention of
students. The topics I have chosen are special determinants that can be evaluated
explicitly, Pfaff's theorem, symplectic matrices, tensor product, lattices, Strassen's
algorithm for fast matrix multiplication, Gershgorin's theorem, and the multiplicity
of eigenvalues. There are other equally attractive topics that could have been chosen:
the Baker-Campbell-Hausdorff formula, the Kreiss matrix theorem, numerical
range, and the inversion of tridiagonal matrices.

Exercises are sprinkled throughout the text: a few of them are routine; most
require some thinking and a few of them require some computing.

My notation is neoclassical. I prefer to use four-letter Anglo-Saxon words like
"into," "onto," and "1-to-1," rather than polysyllabic ones of Norman origin. The
end of a proof is marked by an open square.

The bibliography consists of the usual suspects and some recent texts; in addition,
I have included Courant-Hilbert, Volume I, unchanged from the original German
version in 1924. Several generations of mathematicians and physicists, including the
author, first learned linear algebra from Chapter 1 of this source.

I am grateful to my colleagues at the Courant Institute and to Myron Allen at the
University of Wyoming for reading and commenting on the manuscript and for
trying out parts of it on their classes. I am grateful to Connie Engle and Janice Want
for their expert typing.

I have learned a great deal from Richard Bellman's outstanding book,
Introduction to Matrix Analysis; its influence on the present volume is considerable.
For this reason and to mark a friendship that began in 1945 and lasted until his death
in 1984, I dedicate this book to his memory.


New York, New York 


Peter D. Lax 




CHAPTER 1 


Fundamentals 


This first chapter aims to introduce the notion of an abstract linear space to those
who think of vectors as arrays of components. I want to point out that the class of
abstract linear spaces is no larger than the class of spaces whose elements are arrays.
So what is gained by this abstraction?

First of all, the freedom to use a single symbol for an array; this way we can think
of vectors as basic building blocks, unencumbered by components. The abstract
view leads to simple, transparent proofs of results.

More to the point, the elements of many interesting vector spaces are not
presented in terms of components. For instance, take a linear ordinary differential
equation of degree $n$; the set of its solutions forms a vector space of dimension $n$, yet
they are not presented as arrays.

Even if the elements of a vector space are presented as arrays of numbers, the
elements of a subspace of it may not have a natural description as arrays. Take, for
instance, the subspace of all vectors whose components add up to zero.

Last but not least, the abstract view of vector spaces is indispensable for
infinite-dimensional spaces; even though this text is strictly about finite-dimensional
spaces, it is a good preparation for functional analysis.

Linear algebra abstracts the two basic operations with vectors: the addition of
vectors, and their multiplication by numbers (scalars). It is astonishing that on such
slender foundations an elaborate structure can be built, with romanesque, gothic, and
baroque aspects. It is even more astounding that linear algebra has not only the right
theorems but also the right language for many mathematical topics, including
applications of mathematics.

A linear space $X$ over a field $K$ is a mathematical object in which two operations
are defined:

Addition, denoted by $+$, as in

$$x + y, \qquad (1)$$

and assumed to be commutative:

$$x + y = y + x, \qquad (2)$$

and associative:

$$x + (y + z) = (x + y) + z, \qquad (3)$$

and to form a group, with the neutral element denoted as $0$:

$$x + 0 = x. \qquad (4)$$

The inverse of addition is denoted by $-$:

$$x + (-x) = x - x = 0. \qquad (5)$$


Exercise 1. Show that the zero of vector addition is unique.

The second operation is multiplication of elements of $X$ by elements $k$ of the
field $K$:

$$kx.$$

The result of this multiplication is a vector, that is, an element of $X$.

Multiplication by elements of $K$ is assumed to be associative:

$$k(ax) = (ka)x \qquad (6)$$

and distributive:

$$k(x + y) = kx + ky, \qquad (7)$$

as well as

$$(a + b)x = ax + bx. \qquad (8)$$

We assume that multiplication by the unit of $K$, denoted as $1$, acts as the identity:

$$1x = x. \qquad (9)$$

These are the axioms of linear algebra. We proceed to draw some deductions:

Set $b = 0$ in (8); it follows from Exercise 1 that for all $x$

$$0x = 0. \qquad (10)$$

Set $a = 1$, $b = -1$ in (8); using (9) and (10) we deduce that for all $x$

$$(-1)x = -x.$$

Exercise 2. Show that the vector with all components zero serves as the zero
element of classical vector addition.

In this analytically oriented text the field $K$ will be either the field $\mathbb{R}$ of real
numbers or the field $\mathbb{C}$ of complex numbers.

An interesting example of a linear space is the set of all functions $x(t)$ that satisfy
the differential equation

$$\frac{d^2}{dt^2}x + x = 0.$$

The sum of two solutions is again a solution, and so is the constant multiple of one.
This shows that the set of solutions of this differential equation forms a linear space.

Solutions of this equation describe the motion of a mass connected to a fixed
point by a spring. Once the initial position $x(0) = p$ and initial velocity $\frac{d}{dt}x(0) = v$
are given, the motion is completely determined for all $t$. So solutions can be
described by a pair of numbers $(p, v)$.

The relation between the two descriptions is linear; that is, if $(p, v)$ are the initial
data of a solution $x(t)$ and $(q, w)$ the initial data of another solution $y(t)$, then the
initial data of the solution $x(t) + y(t)$ are $(p + q, v + w) = (p, v) + (q, w)$. Similarly,
the initial data of the solution $kx(t)$ are $(kp, kv) = k(p, v)$.
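
The correspondence between solutions and initial data can be checked numerically. The sketch below is only an illustration: it assumes the closed form $x(t) = p\cos t + v\sin t$ for the solution with initial data $(p, v)$, which is not stated in the text, and the helper names `solution` and `initial_data` are hypothetical.

```python
import math

def solution(p, v):
    """Solution of x'' + x = 0 with x(0) = p and x'(0) = v (assumed closed form)."""
    return lambda t: p * math.cos(t) + v * math.sin(t)

def initial_data(x, h=1e-6):
    """Recover (x(0), x'(0)); the velocity is estimated by a central difference."""
    return x(0.0), (x(h) - x(-h)) / (2 * h)

x = solution(1.0, 2.0)    # initial data (p, v) = (1, 2)
y = solution(-3.0, 0.5)   # initial data (q, w) = (-3, 0.5)
k = 4.0

# Initial data of the sum x + y and of the multiple k x
p_sum, v_sum = initial_data(lambda t: x(t) + y(t))
p_scaled, v_scaled = initial_data(lambda t: k * x(t))

print(p_sum, v_sum)        # close to (p + q, v + w)
print(p_scaled, v_scaled)  # close to (k p, k v)
```

Up to the small error of the difference quotient, the printed pairs agree with $(p + q, v + w)$ and $(kp, kv)$.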

This kind of relation has been abstracted into the notion of isomorphism.

Definition. A one-to-one correspondence between two linear spaces over the
same field that maps sums into sums and scalar multiples into scalar multiples is
called an isomorphism.

Isomorphism is a basic notion in linear algebra. Isomorphic linear spaces are
indistinguishable by means of operations available in linear spaces. Two linear
spaces that are presented in very different ways can be, as we have seen, isomorphic.

Examples of Linear Spaces. (i) Set of all row vectors: $(a_1, \dots, a_n)$, $a_j$ in $K$;
addition, multiplication defined componentwise. This space is denoted as $K^n$.

(ii) Set of all real-valued functions $f(x)$ defined on the real line, $K = \mathbb{R}$.

(iii) Set of all functions with values in $K$, defined on an arbitrary set $S$.

(iv) Set of all polynomials of degree less than $n$ with coefficients in $K$.

Exercise 3. Show that (i) and (iv) are isomorphic.

Exercise 4. Show that if $S$ has $n$ elements, (i) and (iii) are isomorphic.

Exercise 5. Show that when $K = \mathbb{R}$, (iv) is isomorphic with (iii) when $S$
consists of $n$ distinct points of $\mathbb{R}$.

Definition. A subset $Y$ of a linear space $X$ is called a subspace if sums and scalar
multiples of elements of $Y$ belong to $Y$.

Examples of Subspaces. (a) $X$ as in Example (i), $Y$ the set of vectors
$(0, a_2, \dots, a_{n-1}, 0)$ whose first and last component is zero.

(b) $X$ as in Example (ii), $Y$ the set of all periodic functions with period $\pi$.

(c) $X$ as in Example (iii), $Y$ the set of constant functions on $S$.

(d) $X$ as in Example (iv), $Y$ the set of all even polynomials.

Definition. The sum of two subsets $Y$ and $Z$ of a linear space $X$, denoted as
$Y + Z$, is the set of all vectors of form $y + z$, $y$ in $Y$, $z$ in $Z$.

Exercise 6. Prove that $Y + Z$ is a linear subspace of $X$ if $Y$ and $Z$ are.

Definition. The intersection of two subsets $Y$ and $Z$ of a linear space $X$, denoted
as $Y \cap Z$, consists of all vectors $x$ that belong to both $Y$ and $Z$.

Exercise 7. Prove that if $Y$ and $Z$ are linear subspaces of $X$, so is $Y \cap Z$.

Exercise 8. Show that the set $\{0\}$ consisting of the zero element of a linear
space $X$ is a subspace of $X$. It is called the trivial subspace.


Definition. A linear combination of $j$ vectors $x_1, \dots, x_j$ of a linear space is a
vector of the form

$$k_1 x_1 + \cdots + k_j x_j, \qquad k_1, \dots, k_j \in K.$$

Exercise 9. Show that the set of all linear combinations of $x_1, \dots, x_j$ is a
subspace of $X$, and that it is the smallest subspace of $X$ containing $x_1, \dots, x_j$. This is
called the subspace spanned by $x_1, \dots, x_j$.

Definition. A set of vectors $x_1, \dots, x_m$ in $X$ span the whole space $X$ if every $x$ in
$X$ can be expressed as a linear combination of $x_1, \dots, x_m$.

Definition. The vectors $x_1, \dots, x_j$ are called linearly dependent if there is a
nontrivial linear relation between them, that is, a relation of the form

$$k_1 x_1 + \cdots + k_j x_j = 0,$$

where not all $k_1, \dots, k_j$ are zero.

Definition. A set of vectors $x_1, \dots, x_j$ that are not linearly dependent is called
linearly independent.

Exercise 10. Show that if the vectors $x_1, \dots, x_j$ are linearly independent, then
none of the $x_i$ is the zero vector.

Lemma 1. Suppose that the vectors $x_1, \dots, x_n$ span a linear space $X$ and that the
vectors $y_1, \dots, y_j$ in $X$ are linearly independent. Then

$$j \le n.$$

Proof. Since $x_1, \dots, x_n$ span $X$, every vector in $X$ can be written as a linear
combination of $x_1, \dots, x_n$. In particular, $y_1$:

$$y_1 = k_1 x_1 + \cdots + k_n x_n.$$

Since $y_1 \ne 0$ (see Exercise 10), not all $k$ are equal to $0$, say $k_i \ne 0$. Then $x_i$ can be
expressed as a linear combination of $y_1$ and the remaining $x_s$. So the set consisting of
the $x$'s, with $x_i$ replaced by $y_1$, span $X$. If $j \ge n$, repeat this step $n - 1$ more times and
conclude that $y_1, \dots, y_n$ span $X$; if $j > n$, this contradicts the linear independence of
the vectors $y$, for then $y_{n+1}$ is a linear combination of $y_1, \dots, y_n$. □

Definition. A finite set of vectors which span $X$ and are linearly independent is
called a basis for $X$.

Lemma 2. A linear space $X$ which is spanned by a finite set of vectors $x_1, \dots, x_n$
has a basis.

Proof. If $x_1, \dots, x_n$ are linearly dependent, there is a nontrivial relation between
them; from this one of the $x_i$ can be expressed as a linear combination of the rest. So
we can drop that $x_i$. Repeat this step until the remaining $x_j$ are linearly independent;
they still span $X$, and so they form a basis. □
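
The pruning step in the proof of Lemma 2 can be tried out on arrays of rational numbers. The following is a minimal sketch, not a method taken from the text: the helper names `rank` and `prune_to_basis` are hypothetical, and the dependence test ("dropping a vector does not lower the rank") relies on Gaussian elimination, which the book only develops in later chapters.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, found by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def prune_to_basis(spanning):
    """Drop vectors that are combinations of the others until the rest are independent."""
    vectors = list(spanning)
    i = 0
    while i < len(vectors):
        rest = vectors[:i] + vectors[i + 1:]
        if rank(rest) == rank(vectors):  # vectors[i] depends on the rest: drop it
            vectors = rest
        else:
            i += 1
    return vectors

spanning = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 0, 0)]
basis = prune_to_basis(spanning)
print(basis, rank(basis))
```

Which vectors survive depends on the order in which they are examined; the number that survive does not, in line with Theorem 3 below.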


Definition. A linear space $X$ is called finite dimensional if it has a basis.

A finite-dimensional space has many, many bases. When the elements of the
space are represented as arrays with $n$ components, we give preference to the special
basis consisting of the vectors that have one component equal to $1$, while all the
others equal $0$.


Theorem 3. All bases for a finite-dimensional linear space $X$ contain the same
number of vectors. This number is called the dimension of $X$ and is denoted as

$$\dim X.$$

Proof. Let $x_1, \dots, x_n$ be one basis, and let $y_1, \dots, y_m$ be another. By Lemma 1 and
the definition of basis we conclude that $m \le n$, and also $n \le m$. So we conclude that
$n$ and $m$ are equal. □

We define the dimension of the trivial space consisting of the single element 0 to 
be zero. 

Theorem 4. Every linearly independent set of vectors $y_1, \dots, y_j$ in a
finite-dimensional linear space $X$ can be completed to a basis of $X$.

Proof. If $y_1, \dots, y_j$ do not span $X$, there is some $x_1$ that cannot be expressed as a
linear combination of $y_1, \dots, y_j$. Adjoin this $x_1$ to the $y$'s. Repeat this step until the
$y$'s span $X$. This will happen in less than $n$ steps, $n = \dim X$, because otherwise $X$
would contain more than $n$ linearly independent vectors, impossible for a space of
dimension $n$. □
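
The completion step of Theorem 4 can be sketched in code for arrays with $n$ components. This is an illustration under assumptions: the candidates to adjoin are taken from the unit vectors (one possible choice; the proof allows any vector outside the span), and `rank` and `complete_to_basis` are hypothetical helper names.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, found by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def complete_to_basis(independent, n):
    """Adjoin unit vectors not in the current span; the input is assumed independent."""
    basis = list(independent)
    for j in range(n):
        e_j = tuple(1 if i == j else 0 for i in range(n))
        if rank(basis + [e_j]) > len(basis):  # e_j is not a combination of the basis so far
            basis.append(e_j)
    return basis

independent = [(1, 1, 0), (0, 1, 1)]
basis = complete_to_basis(independent, 3)
print(basis)
```

Trying the unit vectors in a different order may yield a different completion, which is the point of the remark that follows the theorem.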

Theorem 4 illustrates the many different ways of forming a basis for a linear
space.

Theorem 5. (a) Every subspace $Y$ of a finite-dimensional linear space $X$ is
finite dimensional.

(b) Every subspace $Y$ has a complement in $X$, that is, another subspace $Z$ such
that every vector $x$ in $X$ can be decomposed uniquely as

$$x = y + z, \qquad y \text{ in } Y,\ z \text{ in } Z. \qquad (11)$$

Furthermore

$$\dim X = \dim Y + \dim Z. \qquad (11)'$$

Proof. We can construct a basis in $Y$ by starting with any nonzero vector $y_1$, and
then adding another vector $y_2$ and another, as long as they are linearly independent.
According to Lemma 1, there can be no more of these $y_i$ than the dimension of $X$. A
maximal set of linearly independent vectors $y_1, \dots, y_j$ in $Y$ spans $Y$ and so forms a
basis of $Y$. According to Theorem 4, this set can be completed to form a basis of $X$ by
adjoining $z_{j+1}, \dots, z_n$. Define $Z$ as the space spanned by $z_{j+1}, \dots, z_n$; clearly $Y$ and
$Z$ are complements, and

$$\dim X = n = j + (n - j) = \dim Y + \dim Z. \qquad □$$

Definition. $X$ is said to be the direct sum of two subspaces $Y$ and $Z$ that are
complements of each other. More generally $X$ is said to be the direct sum of its
subspaces $Y_1, \dots, Y_m$ if every $x$ in $X$ can be expressed uniquely as

$$x = y_1 + \cdots + y_m, \qquad y_j \text{ in } Y_j. \qquad (12)$$

This relation is denoted as

$$X = Y_1 \oplus \cdots \oplus Y_m.$$

Exercise 11. Prove that if $X$ is finite dimensional and the direct sum of
$Y_1, \dots, Y_m$, then

$$\dim X = \sum_j \dim Y_j. \qquad (12)'$$

Definition. An $(n - 1)$-dimensional subspace of an $n$-dimensional space is
called a hyperplane.

Exercise 12. Show that every finite-dimensional space $X$ over $K$ is isomorphic
to $K^n$, $n = \dim X$. Show that this isomorphism is not unique when $n$ is $> 1$.

Since every $n$-dimensional linear space over $K$ is isomorphic to $K^n$, it follows that
two linear spaces over the same field and of the same dimension are isomorphic.

Note: There are many ways of forming such an isomorphism; it is not unique.

The concept of congruence modulo a subspace, defined below, is a very useful
tool.


Definition. For $X$ a linear space, $Y$ a subspace, we say that two vectors $x_1, x_2$ in $X$
are congruent modulo $Y$, denoted

$$x_1 \equiv x_2 \bmod Y,$$

if $x_1 - x_2 \in Y$. Congruence mod $Y$ is an equivalence relation, that is, it is

(i) symmetric: if $x_1 \equiv x_2$, then $x_2 \equiv x_1$,

(ii) reflexive: $x \equiv x$ for all $x$ in $X$,

(iii) transitive: if $x_1 \equiv x_2$, $x_2 \equiv x_3$, then $x_1 \equiv x_3$.

Exercise 13. Prove (i)-(iii) above. Show furthermore that if $x_1 \equiv x_2$, then
$kx_1 \equiv kx_2$ for every scalar $k$.

We can divide elements of $X$ into congruence classes mod $Y$. The congruence
class containing the vector $x$ is the set of all vectors congruent with $x$; we denote it
by $\{x\}$.

Exercise 14. Show that two congruence classes are either identical or disjoint.

The set of congruence classes can be made into a linear space by defining addition
and multiplication by scalars, as follows:

$$\{x\} + \{z\} = \{x + z\}$$

and

$$k\{x\} = \{kx\}.$$

That is, the sum of the congruence class containing $x$ and the congruence
class containing $z$ is the class containing $x + z$. Similarly for multiplication by
scalars.

Exercise 15. Show that the above definition of addition and multiplication
by scalars is independent of the choice of representatives in the congruence
class.

The linear space of congruence classes defined above is called the quotient space
of $X$ mod $Y$ and is denoted as

$$X \ (\mathrm{mod}\ Y) \quad \text{or} \quad X/Y.$$

The following example is illuminating: Take $X$ to be the linear space of all row
vectors $(a_1, \dots, a_n)$ with $n$ components, and take $Y$ to be all vectors
$y = (0, 0, a_3, \dots, a_n)$ whose first two components are zero. Then two vectors are
congruent mod $Y$ iff their first two components are equal. Each equivalence class can
be represented by a vector with two components, the common components of all
vectors in the equivalence class.

This shows that forming a quotient space amounts to throwing away information
contained in those components that pertain to $Y$. This is a very useful simplification
when we do not need the information contained in the neglected components.
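
The example can be made concrete in a few lines. This is a sketch for illustration only; the function names `congruent_mod_Y`, `coset_representative`, and `add` are hypothetical, and $Y$ is the subspace of the example, the vectors whose first two components vanish.

```python
def congruent_mod_Y(x1, x2):
    """x1 and x2 are congruent mod Y iff x1 - x2 has first two components zero."""
    diff = [a - b for a, b in zip(x1, x2)]
    return diff[0] == 0 and diff[1] == 0

def coset_representative(x):
    """Represent the class {x} by the two components shared by all its members."""
    return (x[0], x[1])

def add(u, v):
    """Componentwise addition of arrays of equal length."""
    return tuple(a + b for a, b in zip(u, v))

x1 = (1, 2, 3, 4)
x2 = (1, 2, -7, 0)   # differs from x1 only in the neglected components
z = (5, 6, 0, 9)

print(congruent_mod_Y(x1, x2))   # the two vectors lie in the same class
# {x} + {z} = {x + z}, and the result does not depend on the representative chosen
print(coset_representative(add(x1, z)), coset_representative(add(x2, z)))
```

Both sums land in the class represented by the componentwise sum of the two representatives, which is the independence of representatives asked for in Exercise 15, seen in one instance.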

The next result shows the usefulness of quotient spaces for counting the
dimension of a subspace.

Theorem 6. $Y$ is a subspace of a finite-dimensional linear space $X$; then

$$\dim Y + \dim(X/Y) = \dim X. \qquad (13)$$

Proof. Let $y_1, \dots, y_j$ be a basis for $Y$, $j = \dim Y$. According to Theorem 4, this set
can be completed to form a basis for $X$ by adjoining $x_{j+1}, \dots, x_n$, $n = \dim X$. We
claim that

$$\{x_{j+1}\}, \dots, \{x_n\} \qquad (13)'$$

form a basis for $X/Y$. To show this we have to verify two properties of the cosets
(13)':

(i) They span $X/Y$.

(ii) They are linearly independent.

(i) Since $y_1, \dots, y_j, x_{j+1}, \dots, x_n$ form a basis for $X$, every $x$ in $X$ can be expressed as

$$x = \sum a_i y_i + \sum b_k x_k.$$

It follows that

$$\{x\} = \sum b_k \{x_k\}.$$

(ii) Suppose that

$$\sum c_k \{x_k\} = 0.$$

This means that

$$\sum c_k x_k = y, \qquad y \text{ in } Y.$$

Express $y$ as $\sum d_i y_i$; we get

$$\sum c_k x_k - \sum d_i y_i = 0.$$

Since $y_1, \dots, y_j, x_{j+1}, \dots, x_n$ form a basis, they are linearly independent, and so all
the $c_k$ and $d_i$ are zero.

It follows that

$$\dim X/Y = n - j.$$

So

$$\dim Y + \dim X/Y = j + n - j = n = \dim X. \qquad □$$

Exercise 16. Denote by $X$ the linear space of all polynomials $p(t)$ of degree
$< n$, and denote by $Y$ the set of polynomials that are zero at $t_1, \dots, t_j$, $j < n$.

(i) Show that $Y$ is a subspace of $X$.

(ii) Determine $\dim Y$.

(iii) Determine $\dim X/Y$.

The following corollary is a consequence of Theorem 6.

Corollary 6'. A subspace $Y$ of a finite-dimensional linear space $X$ whose
dimension is the same as the dimension of $X$ is all of $X$.

Exercise 17. Prove Corollary 6'.
Theorem 7. Suppose $X$ is a finite-dimensional linear space, $U$ and $V$ two
subspaces of $X$ such that $X$ is the sum of $U$ and $V$:

$$X = U + V.$$

Denote by $W$ the intersection of $U$ and $V$:

$$W = U \cap V.$$

Then

$$\dim X = \dim U + \dim V - \dim W. \qquad (14)$$

Proof. When the intersection $W$ of $U$ and $V$ is the trivial space $\{0\}$, $\dim W = 0$,
and (14) is relation (11)' of Theorem 5. We show now how to use the notion of
quotient space to reduce the general case to the simple case $\dim W = 0$.

Define $U_0 = U/W$, $V_0 = V/W$; then $U_0 \cap V_0 = \{0\}$, and so $X_0 = X/W$ satisfies

$$X_0 = U_0 + V_0.$$

So according to (11)',

$$\dim X_0 = \dim U_0 + \dim V_0. \qquad (14)'$$

Applying (13) of Theorem 6 three times, we get

$$\dim X_0 = \dim X - \dim W, \quad \dim U_0 = \dim U - \dim W, \quad \dim V_0 = \dim V - \dim W.$$

Setting this into relation (14)' gives (14). □
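
Relation (14) can be checked on a small instance. The sketch below is illustrative only: the subspaces $U$, $V$ of four-component arrays are made up for the purpose, the basis of $W = U \cap V$ was worked out by hand rather than computed, and `rank` and `contained_in` are hypothetical helper names.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, found by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def contained_in(vectors, spanning):
    """True if every given vector lies in the span of `spanning`."""
    return rank(list(spanning) + list(vectors)) == rank(spanning)

U = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
V = [(0, 0, 1, 0), (0, 0, 0, 1), (0, 1, 1, 0)]
W = [(0, 0, 1, 0), (0, 1, 1, 0)]   # basis of U ∩ V, found by hand (an assumption of this sketch)

dim_U, dim_V, dim_W = rank(U), rank(V), rank(W)
dim_sum = rank(U + V)              # dimension of the sum U + V

print(contained_in(W, U), contained_in(W, V))
print(dim_sum, dim_U + dim_V - dim_W)
```

Here the sum $U + V$ is the whole space of four-component arrays, so it plays the role of $X$ in the theorem.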

Definition. The Cartesian sum of two linear spaces $X_1$, $X_2$ over the same field is the set
of pairs

$$(x_1, x_2), \qquad x_1 \text{ in } X_1,\ x_2 \text{ in } X_2,$$

where addition and multiplication by scalars is defined componentwise. The direct
sum is denoted as

$$X_1 \oplus X_2.$$

It is easy to verify that $X_1 \oplus X_2$ is indeed a linear space.

Exercise 18. Show that

$$\dim X_1 \oplus X_2 = \dim X_1 + \dim X_2.$$

Exercise 19. $X$ a linear space, $Y$ a subspace. Show that $Y \oplus X/Y$ is isomorphic
to $X$.

Note: The most frequently occurring linear spaces in this text are our old friends
$\mathbb{R}^n$ and $\mathbb{C}^n$, the spaces of vectors $(a_1, \dots, a_n)$ with $n$ real, respectively complex,
components.

So far the only means we have for showing that a linear space $X$ is finite
dimensional is to find a finite set of vectors that span it. In Chapter 7 we present
another, powerful criterion for a Euclidean space to be finite dimensional. In Chapter
14 we extend this criterion to all normed linear spaces.

We have been talking about sets of vectors being linearly dependent or
independent, but have given no indication how to decide which is the case. Here is an
example:

Decide if the four vectors

$$\begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ -1 \\ 1 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} 2 \\ -1 \\ 1 \\ 3 \end{pmatrix}, \quad \begin{pmatrix} 2 \\ -1 \\ 0 \\ 3 \end{pmatrix}$$

are linearly dependent or not. That is, are there four numbers $k_1, k_2, k_3, k_4$, not all
zero, such that

$$k_1 \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \end{pmatrix} + k_2 \begin{pmatrix} 1 \\ -1 \\ 1 \\ 1 \end{pmatrix} + k_3 \begin{pmatrix} 2 \\ -1 \\ 1 \\ 3 \end{pmatrix} + k_4 \begin{pmatrix} 2 \\ -1 \\ 0 \\ 3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}?$$

This vector equation is equivalent to four scalar equations:

$$\begin{aligned}
k_1 + k_2 + 2k_3 + 2k_4 &= 0, \\
k_1 - k_2 - k_3 - k_4 &= 0, \\
k_2 + k_3 &= 0, \\
k_1 + k_2 + 3k_3 + 3k_4 &= 0.
\end{aligned} \qquad (15)$$

The study of such systems of linear equations is the subject of Chapters 3 and 4.
There we describe an algorithm for finding all solutions of such systems of
equations.
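
A machine check of such a question might look like the sketch below. It is not the algorithm of Chapters 3 and 4, only plain Gaussian elimination over the rationals. The four vectors are entered by hand as the columns of coefficients of system (15), with the signs as read here (an assumption), and `rank` is a hypothetical helper name.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, found by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Coefficient columns of k_1, k_2, k_3, k_4 in system (15), entered by hand
vectors = [
    (1, 1, 0, 1),
    (1, -1, 1, 1),
    (2, -1, 1, 3),
    (2, -1, 0, 3),
]

# The vectors are independent exactly when the only solution of (15) is k = 0,
# that is, when the rank equals the number of vectors.
independent = rank(vectors) == len(vectors)
print(rank(vectors), independent)
```

With the coefficients as entered, elimination finds four pivots, so only the trivial solution exists. A different reading of a sign in (15) would have to be rechecked.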
Exercise 20. Which of the following sets of vectors $x = (x_1, \dots, x_n)$ in $\mathbb{R}^n$ are a
subspace of $\mathbb{R}^n$? Explain your answer.

(a) All $x$ such that $x_1 \ge 0$.

(b) All $x$ such that $x_1 + x_2 = 0$.

(c) All $x$ such that $x_1 + x_2 + 1 = 0$.

(d) All $x$ such that $x_1 = 0$.

(e) All $x$ such that $x_1$ is an integer.

Exercise 21. Let $U$, $V$, and $W$ be subspaces of some finite-dimensional vector
space $X$. Is the statement

$$\dim(U + V + W) = \dim U + \dim V + \dim W - \dim(U \cap V) - \dim(U \cap W) - \dim(V \cap W) + \dim(U \cap V \cap W)$$

true or false? If true, prove it. If false, provide a counterexample.



CHAPTER 2 


Duality 


Readers who are meeting the concept of an abstract linear space for the first time 
may balk at the notion of the dual space as piling an abstraction on top of an 
abstraction, I hope that the results presented at the end of this chapter will convince 
such skeptics that the notion is not only natural but useful for expeditiously deriving 
interesting concrete results. The dual of a normed linear space, presented in Chapter
14, is a particularly fruitful idea.

The dual of an infinite-dimensional normed linear space is indispensable for their
study.

Let $X$ be a linear space over a field $K$. A scalar valued function $l$,

$$l: X \to K,$$

defined on $X$, is called linear if

$$l(x + y) = l(x) + l(y) \qquad (1)$$

for all $x$, $y$ in $X$, and

$$l(kx) = k\,l(x) \qquad (1)'$$

for all $x$ in $X$ and all $k$ in $K$. Note that these two properties, applied repeatedly, show
that

$$l(k_1 x_1 + \cdots + k_n x_n) = k_1 l(x_1) + \cdots + k_n l(x_n). \qquad (1)''$$

We define the sum of two functions by pointwise addition; that is,

$$(l + m)(x) = l(x) + m(x).$$


Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright © 2007 John Wiley & Sons, Inc.
Multiplication of a function by a scalar is defined similarly. It is easy to verify that
the sum of two linear functions is linear, as is the scalar multiple of one. Thus the set
of linear functions on a linear space $X$ itself forms a linear space, called the dual of $X$
and denoted by $X'$.

Example 1. X = {continuous functions f(s), 0 ≤ s ≤ 1}. Then for any point
s_1 in [0, 1],

l(f) = f(s_1)

is a linear function. So is

l(f) = \sum_j k_j f(s_j),

where s_j is an arbitrary collection of points in [0, 1], k_j arbitrary scalars. So is

l(f) = \int_0^1 f(s) ds.

Example 2. X = {Differentiable functions f on [0, 1]}. For s in [0, 1],

l(f) = \sum_j a_j ∂^j f(s)

is a linear function, where ∂^j denotes the jth derivative.

Theorem 1. Let X be a linear space of dimension n. The elements x of X can be
represented as arrays of n scalars:

x = (c_1, ..., c_n).    (3)

Addition and multiplication by a scalar are defined componentwise. Let a_1, ..., a_n be
any array of n scalars; the function l defined by

l(x) = a_1 c_1 + ... + a_n c_n    (4)

is a linear function of x. Conversely, every linear function l of x can be so
represented.

Proof. That l(x) defined by (4) is a linear function of x is obvious. The converse is
not much harder. Let l be any linear function defined on X. Define x_j to be the vector
whose jth component is 1, with all other components zero. Then x defined by (3) can
be expressed as

x = c_1 x_1 + ... + c_n x_n.

Denote l(x_j) by a_j; it follows from formula (1)'' that l is of form (4).

□
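The proof of Theorem 1 can be traced numerically. The following is a minimal sketch, assuming vectors are stored as Python lists of rationals; the names `coefficients_of`, `apply_array`, and `example_l` are hypothetical and chosen only for illustration. It recovers the array (a_1, ..., a_n) by evaluating a linear function on the unit vectors x_j, then checks formula (4) on one vector.

```python
from fractions import Fraction

def coefficients_of(l, n):
    """Return the array (a_1, ..., a_n) with a_j = l(x_j), x_j the j-th unit array."""
    coeffs = []
    for j in range(n):
        unit = [Fraction(0)] * n
        unit[j] = Fraction(1)
        coeffs.append(l(unit))
    return coeffs

def apply_array(a, x):
    """Evaluate formula (4): a_1 c_1 + ... + a_n c_n."""
    return sum(a_j * c_j for a_j, c_j in zip(a, x))

# Hypothetical linear function on arrays of 3 scalars (an assumption for the demo).
def example_l(x):
    return 2 * x[0] - x[1] + 5 * x[2]

a = coefficients_of(example_l, 3)
x = [Fraction(1, 2), Fraction(3), Fraction(-4)]
print(a, apply_array(a, x), example_l(x))
```

The two printed values agree because `example_l` is linear; for a function that is not linear the array `a` would still be defined, but formula (4) would no longer reproduce the function.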
Theorem 1 shows that if the vectors in X are regarded as arrays of n scalars, then
the elements l of X' can also be regarded as arrays of n scalars. It follows from (4)
that the sum of two linear functions is represented by the sum of the two arrays
representing the summands.

Similarly, multiplication of l by a scalar is accomplished by multiplying each
component. We deduce from all this the following theorem.

Theorem 2. The dual X' of a finite-dimensional linear space X is a finite-
dimensional linear space, and

dim X' = dim X.

The right-hand side of (4) depends symmetrically on the two arrays representing
x and l. Therefore we ought to write the left-hand side also symmetrically; we
accomplish that by the scalar product notation

(l, x).    (5)

We call it a product because it is a bilinear function of l and x: for fixed l it is a linear
function of x, and for fixed x it is a linear function of l.

Since X' is a linear space, it has its own dual X'' consisting of all linear functions
on X'. For fixed x, (l, x) is such a linear function. By Theorem 1, all linear functions
are of this form. This proves the following theorem.

Theorem 3. The bilinear function (l, x) defined in (5) gives a natural
identification of X with X''.

Exercise 1. Given a nonzero vector x_1 in X, show that there is a linear function l
such that

l(x_1) ≠ 0.

Definition. Let Y be a subspace of X. The set of linear functions l that vanish on
Y, that is, satisfy

l(y) = 0 for all y in Y,    (6)

is called the annihilator of the subspace Y; it is denoted by Y^⊥.

Exercise 2. Verify that Y^⊥ is a subspace of X'.

Theorem 4. Let Y be a subspace of a finite-dimensional space X, Y^⊥ its
annihilator. Then

dim Y^⊥ + dim Y = dim X.    (7)
Proof. We shall establish a natural isomorphism between Y^⊥ and the dual (X/Y)'
of X/Y. Given l in Y^⊥ we define L in (X/Y)' as follows: for any congruence class {x}
in X/Y, we define

L{x} = l(x).    (8)

It follows from (6) that this definition of L is unequivocal, that is, does not depend on
the element x picked to represent the class.

Conversely, given any L in (X/Y)', (8) defines a linear function l on X that
satisfies (6). Clearly, the correspondence between l and L is one-to-one and an
isomorphism. Thus, since isomorphic linear spaces have the same dimension,

dim Y^⊥ = dim (X/Y)'.

By Theorem 2, dim (X/Y)' = dim X/Y, and by Theorem 6 of Chapter 1,
dim X/Y = dim X - dim Y, so Theorem 4 follows. □

The dimension of Y^⊥ is called the codimension of Y as a subspace of X. By
Theorem 4,

codim Y + dim Y = dim X.

Since Y^⊥ is a subspace of X', its annihilator, denoted by Y^⊥⊥, is a subspace
of X''.

Theorem 5. Under the identification (5) of X'' and X, for every subspace Y of a
finite-dimensional space X,

Y^⊥⊥ = Y.

Proof. It follows from definition (6) of the annihilator of Y that all y in Y belong to
Y^⊥⊥, the annihilator of Y^⊥. To show that Y is all of Y^⊥⊥, we make use of (7) applied
to X' and its subspace Y^⊥:

dim Y^⊥⊥ + dim Y^⊥ = dim X'.    (7)'

Since dim X' = dim X, it follows by comparing (7) and (7)' that

dim Y^⊥⊥ = dim Y.

So Y is a subspace of Y^⊥⊥ that has the same dimension as Y^⊥⊥; but then according to
Corollary 6' in Chapter 1, Y = Y^⊥⊥. □
The following notion is useful:

Definition. Let X be a finite-dimensional linear space, and let S be a subset of
X. The annihilator S^⊥ of S is the set of linear functions l that are zero at all vectors
s of S:

l(s) = 0 for s in S.

Theorem 6. Denote by Y the smallest subspace containing S:

S^⊥ = Y^⊥.

Exercise 3. Prove Theorem 6.

According to formalist philosophy, all of mathematics is tautology. Chapter 2
might strike the reader, as it does the author, as quintessential tautology. Yet even
this trivial-looking material has some interesting consequences:

Theorem 7. Let I be an interval on the real axis, t_1, ..., t_n n distinct points.
Then there exist n numbers m_1, ..., m_n such that the quadrature formula

\int_I p(t) dt = m_1 p(t_1) + ... + m_n p(t_n)    (9)

holds for all polynomials p of degree less than n.

Proof. Denote by X the space of all polynomials p(t) = a_0 + a_1 t + ... + a_{n-1} t^{n-1}
of degree less than n. Since X is isomorphic to the space (a_0, a_1, ..., a_{n-1}) =
R^n, dim X = n. We define l_j as the linear function

l_j(p) = p(t_j).    (10)

The l_j are elements of the dual space of X; we claim that they are linearly
independent. For suppose there is a linear relation between them:

c_1 l_1 + ... + c_n l_n = 0.    (11)

According to the definition of the l_j, (11) means that

c_1 p(t_1) + ... + c_n p(t_n) = 0    (12)

for all polynomials p of degree less than n. Define the polynomial q_k as the product

q_k(t) = \prod_{j ≠ k} (t - t_j).

Clearly, q_k is of degree n - 1, and is zero at all points t_j, j ≠ k. Since the points t_j are
distinct, q_k is nonzero at t_k. Set p = q_k in (12); since q_k(t_j) = 0 for j ≠ k, we obtain
that c_k q_k(t_k) = 0; since q_k(t_k) is not zero, c_k must be. This shows that all coefficients
c_k are zero, that is, that the linear relation (11) is trivial. Thus the l_j, j = 1, ..., n, are n
linearly independent elements of X'. According to Theorem 2, dim X' = dim X = n;
therefore the l_j form a basis of X'. This means that any other linear function l on X
can be represented as a linear combination of the l_j:

l = m_1 l_1 + ... + m_n l_n.

The integral of p over I is a linear function of p; therefore it can be represented as
above. This proves that given any n distinct points t_1, ..., t_n, there is a formula of
form (9) that is valid for all polynomials of degree less than n. □
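The existence argument can be turned into a small computation. The sketch below is an illustration under stated assumptions, not the method of the text: it requires (9) to hold for the monomials 1, t, ..., t^{n-1} and solves the resulting n equations for the weights by Gauss-Jordan elimination over the rationals. The helper names `solve` and `quadrature_weights`, and the choice of the points 0, 1/2, 1 on the interval [0, 1], are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A m = b by Gauss-Jordan elimination (exact rationals)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b_i)] for row, b_i in zip(A, b)]
    for col in range(n):
        # A nonzero pivot exists because the points are assumed distinct.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v_r - factor * v_c for v_r, v_c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def quadrature_weights(points, a, b):
    """Weights m_j with sum_j m_j t_j^k = integral of t^k over [a, b], k = 0..n-1."""
    n = len(points)
    A = [[Fraction(t) ** k for t in points] for k in range(n)]
    rhs = [(Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1) for k in range(n)]
    return solve(A, rhs)

points = [Fraction(0), Fraction(1, 2), Fraction(1)]
weights = quadrature_weights(points, 0, 1)
print(weights)
```

For these three points the computed weights are 1/6, 2/3, 1/6. Since both sides of (9) are linear in p, exactness on the monomials of degree less than n gives exactness on all polynomials of degree less than n.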

Exercise 4. In Theorem 7 take the interval I to be [-1, 1], and take n to be 3.
Choose the three points to be t_1 = -a, t_2 = 0, and t_3 = a.

(i) Determine the weights m_1, m_2, m_3 so that (9) holds for all polynomials of
degree < 3.

(ii) Show that for a > \sqrt{1/3}, all three weights are positive.

(iii) Show that for a = \sqrt{3/5}, (9) holds for all polynomials of degree < 6.


Exercise 5. In Theorem 7 take the interval I to be [-1, 1], and take n = 4.
Choose the four points to be -a, -b, b, a.

(i) Determine the weights m_1, m_2, m_3, and m_4 so that (9) holds for all
polynomials of degree < 4.

(ii) For what values of a and b are the weights positive?


Exercise 6. Let P_2 be the linear space of all polynomials

p(x) = a_0 + a_1 x + a_2 x^2

with real coefficients and degree ≤ 2. Let ξ_1, ξ_2, ξ_3 be three distinct real numbers, and
then define

l_j(p) = p(ξ_j) for j = 1, 2, 3.

(a) Show that l_1, l_2, l_3 are linearly independent linear functions on P_2.

(b) Show that l_1, l_2, l_3 is a basis for the dual space P_2'.

(c) (1) Suppose {e_1, ..., e_n} is a basis for the vector space V. Show there exist
linear functions {l_1, ..., l_n} in the dual space V' defined by

l_i(e_j) = 1 if i = j,
l_i(e_j) = 0 if i ≠ j.

Show that {l_1, ..., l_n} is a basis of V', called the dual basis.

(2) Find the polynomials p_1(x), p_2(x), p_3(x) in P_2 for which l_1, l_2, l_3 is the
dual basis in P_2'.


Exercise 7. Let W be the subspace of R^4 spanned by (1, 0, -1, 2) and (2, 3, 1, 1).

Which linear functions l(x) = c_1 x_1 + c_2 x_2 + c_3 x_3 + c_4 x_4 are in the annihilator of W?




CHAPTER 3 


Linear Mappings 


Chapter 3 abstracts the concept of a matrix as a linear mapping of one linear space 
into another. Again I point out that no greater generality is achieved, so what has 
been gained? 

First of all, simplicity of notation; we can refer to mappings by single symbols,
instead of rectangular arrays of numbers. The abstract view leads to simple,
transparent proofs. This is strikingly illustrated by the proof of the associative law of
matrix multiplication and by the proof of the basic result that the column rank of a
matrix equals its row rank.

Many important mappings are not presented in matrix form; see, for example, the
first two applications presented in this chapter.

Last but not least, the abstract view is indispensable for infinite-dimensional
spaces. There the view of mappings as infinite matrices has held up progress until it
was replaced by an abstract concept.

A mapping from one set X into another set U is a function whose arguments are
points of X and whose values are points of U:

f(x) = u.

In this chapter we discuss a class of very special mappings:

(i) Both X, called the domain space, and U, called the target space, are linear
spaces over the same field.

(ii) A mapping T: X → U is called linear if it is additive, that is, satisfies

T(x + y) = T(x) + T(y)

for all x, y in X, and if it is homogeneous, that is, satisfies

T(kx) = kT(x)


for all x in X and all k in K. The value of T at x is written multiplicatively as
Tx; the additive property becomes the distributive law: T(x + y) = Tx + Ty.

Other names for linear mapping are linear transformation and linear operator.

Example 1. Any isomorphism.

Example 2. X = U = polynomials of degree less than n in s; T = d/ds.

Example 3. X = U = R^2, T rotation around the origin by angle θ.

Example 4. X any linear space, U the one-dimensional space K, T any linear
function on X.

Example 5. X = U = Differentiable functions, T linear differential operator.

Example 6. X = U = C_0(R), (Tf)(x) = \int_{-1}^{1} f(y)(x - y)^2 dy.

Example 7. X = R^n, U = R^m, u = Tx defined by

u_i = \sum_{j=1}^{n} t_{ij} x_j,    i = 1, ..., m.

Here u = (u_1, ..., u_m), x = (x_1, ..., x_n).

Theorem 1. (a) The image of a subspace of X under a linear map T is a
subspace of U.

(b) The inverse image of a subspace of U, that is, the set of all vectors in X
mapped by T into the subspace, is a subspace of X.

Exercise 1. Prove Theorem 1.


Definition. The range of T is the image of X under T; it is denoted as R_T. By
part (a) of Theorem 1, it is a subspace of U.

Definition. The nullspace of T is the set of x in X mapped into 0 by T: Tx = 0; it is
denoted as N_T. By part (b) of Theorem 1, it is a subspace of X.

The following result is a workhorse of the subject, a fundamental result about
linear maps.

Theorem 2. Let T: X → U be a linear map; then

dim N_T + dim R_T = dim X.
Proof. Since T maps N_T into 0, Tx_1 = Tx_2 when x_1 and x_2 are equivalent mod N_T.
So we can define T acting on the quotient space X/N_T by setting

T{x} = Tx.

T is an isomorphism between X/N_T and R_T; since isomorphic spaces have the same
dimension,

dim X/N_T = dim R_T.

According to Theorem 6 of Chapter 1, dim X/N_T = dim X - dim N_T; combined with
the relation above we get Theorem 2. □
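Theorem 2 can be checked on a concrete map of the kind in Example 7. The following is a minimal sketch, assuming the map is given by a rectangular array of rationals; the functions `rref` and `nullspace_basis` and the sample array `T` are hypothetical names introduced for illustration, and row reduction is used here only as a convenient way to count dimensions, not as the argument of the proof.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot columns) over the rationals."""
    M = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [v_i - f * v_r for v_i, v_r in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def nullspace_basis(rows):
    """One basis vector of the nullspace for each non-pivot (free) column."""
    M, pivots = rref(rows)
    ncols = len(rows[0])
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -M[r][f]
        basis.append(v)
    return basis

T = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]          # a map from R^4 to R^3
rank = len(rref(T)[1])      # dim R_T
null_basis = nullspace_basis(T)
print(rank, len(null_basis))
```

Here dim R_T = 2 and dim N_T = 2, which add up to dim X = 4, as Theorem 2 requires.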

Corollary A. Suppose dim U < dim X; then

Tx = 0 for some x ≠ 0.

Corollary B. Suppose dim U = dim X and the only vector satisfying Tx = 0 is x = 0. Then

R_T = U.


Proof. A. dim R_T ≤ dim U < dim X; it follows therefore from Theorem 2 that
dim N_T > 0, that is, that N_T contains some vector not equal to 0.

B. By hypothesis, N_T = {0}, so dim N_T = 0. It follows then from Theorem 2 and
from the assumption in B that


dim R_T = dim X = dim U.

By Corollary 6' of Chapter 1, a subspace whose dimension is the dimension of the
whole space is the whole space; therefore R_T = U. □

Theorem 2 and its corollaries have many applications, possibly more
than any other theorem of mathematics. It is useful to have concrete versions of
them.


Corollary A'. X = R^n, U = R^m, m < n. Let T be any mapping of R^n → R^m as
in Example 7; since m = dim U < dim X = n, by Corollary A, the system of linear
equations

\sum_{j=1}^{n} t_{ij} x_j = 0,    i = 1, ..., m    (1)

has a nontrivial solution, that is, one where at least one x_j ≠ 0.
Corollary B'. X = R^n, U = R^n, T given by

\sum_{j=1}^{n} t_{ij} x_j = u_i,    i = 1, ..., n.    (2)

If the homogeneous system of equations

\sum_{j=1}^{n} t_{ij} x_j = 0,    i = 1, ..., n    (3)

has only the trivial solution x_1 = ... = x_n = 0, then the inhomogeneous system (2)
has a solution for all u_1, ..., u_n. Since the homogeneous system (3) has only the
trivial solution, the solution of (2) is uniquely determined.

Application 1. Take X equal to the space of all polynomials p(s) with complex
coefficients of degree less than n, and take U = C^n. We choose s_1, ..., s_n as n
distinct complex numbers, and define the linear mapping T: X → U by

Tp = (p(s_1), ..., p(s_n)).

We claim that N_T is trivial; for Tp = 0 means that p(s_1) = 0, ..., p(s_n) = 0, that is,
that p has zeros at s_1, ..., s_n. But a polynomial p of degree less than n cannot have n
distinct zeros, unless p ≡ 0. Then by Corollary B, the range of T is all of U; that is,
the values of p at s_1, ..., s_n can be prescribed arbitrarily.

Application 2. X is the space of polynomials with real coefficients of degree
< n, U = R^n. We choose n pairwise disjoint intervals S_1, ..., S_n on the real axis. We
define \bar{p}_j to be the average value of p over S_j:

\bar{p}_j = (1/|S_j|) \int_{S_j} p(s) ds,    |S_j| = length of S_j.    (4)

We define the linear mapping T: X → U by

Tp = (\bar{p}_1, ..., \bar{p}_n).

We claim that the nullspace of T is trivial; for, if \bar{p}_j = 0, p changes sign in S_j and so
vanishes somewhere in S_j. Since the S_j are pairwise disjoint, p would have n distinct
zeros, too many for a polynomial of degree less than n. Then by Corollary B the
range of T is all of U; that means that the average values of p over the intervals
S_1, ..., S_n can be prescribed arbitrarily.
Application 3. In constructing numerical approximations to solutions of the
Laplace equation in a bounded domain G of the plane,

Δu = u_xx + u_yy = 0 in G,    (5)

with u prescribed on the boundary of G, one fills G, approximately, with a lattice and
replaces the second partial derivatives with centered differences:

u_xx ≈ (u_W - 2u_O + u_E) / h^2,    u_yy ≈ (u_N - 2u_O + u_S) / h^2,    (6)

where O is a lattice point, W, E, N, and S are its west, east, north, and south lattice
neighbors, and h is the mesh spacing. Setting (6) into (5) gives the following relations:

u_O = (u_W + u_N + u_E + u_S) / 4.    (7)

This equation relates the value u_O of u at each lattice point O in the domain G to the
values of u at the four lattice neighbors of O. In case any lattice neighbor of O lies
outside G, we set the value of u there equal to the boundary value of u at the nearest
boundary point. The resulting set of equations (7) is a system of n equations for n
unknowns of the form (2); n is equal to the number of lattice points in G.

We claim that the corresponding homogeneous equations have only the trivial
solution u_O = 0 for all lattice points. The homogeneous equations correspond to
taking the boundary values to be zero. Now take any solution of the homogeneous
equations and denote by u_max the maximal value of u_O over all lattice points in G.
That maximum is assumed at some point O of G; it follows from (7) that then
u = u_max at all four lattice neighbors of O. Repeating this argument we eventually
reach a lattice neighbor which falls outside G. Since u was set to zero at all such
points, we conclude that u_max = 0. Similarly we show that u_min = 0; together these
imply that u_O = 0 for all lattice points for a solution of the homogeneous equation.
By Corollary B', the system of equations (7), with arbitrary boundary data, has a
unique solution.
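To see the system (7) in action, here is a small sketch under simplifying assumptions that are not in the text: the domain is a square whose interior holds an n-by-n block of lattice points, the lattice neighbors outside the block carry the boundary values directly, and the equations are solved by repeated Gauss-Seidel sweeps (an iterative technique swapped in for convenience; the text only asserts that a unique solution exists). The names `solve_lattice` and `boundary` are hypothetical.

```python
def solve_lattice(n, boundary, sweeps=500):
    """Gauss-Seidel sweeps for relation (7) on an n-by-n block of interior points."""
    # Indices 0 and n + 1 in either direction hold the prescribed boundary values.
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n + 2):
        for j in range(n + 2):
            if i in (0, n + 1) or j in (0, n + 1):
                u[i][j] = float(boundary(i, j))
    for _ in range(sweeps):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                # Relation (7): the value at O is the mean of its four neighbors.
                u[i][j] = (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]) / 4.0
    return u

# Boundary data taken from the linear function i + 2j, which satisfies (7) exactly.
u = solve_lattice(3, lambda i, j: i + 2 * j)
print([[round(u[i][j], 6) for j in range(1, 4)] for i in range(1, 4)])
```

Because a linear function satisfies (7) exactly and the solution is unique, the computed interior values should agree with i + 2j up to rounding.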

Exercise 2. Let

\sum_{j=1}^{n} t_{ij} x_j = u_i,    i = 1, ..., m

be an overdetermined system of linear equations, that is, the number m of equations
is greater than the number n of unknowns x_1, ..., x_n. Take the case that in spite of the
overdeterminacy, this system of equations has a solution, and assume that this
solution is unique. Show that it is possible to select a subset of n of these equations
which uniquely determine the solution.

We turn now to the rudiments of the algebra of linear mappings, that is, their
addition and multiplication. Suppose that T and S are both linear maps of X → U;
then we define their sum T + S by setting for each vector x in X,


(T + S)(x) = Tx + Sx.


Clearly, under this definition T + S is again a linear map of X → U. We define kT
similarly, and we get another linear map.

It is not hard to show that under the above definition the set of linear mappings of
X → U themselves forms a linear space. This space is denoted by ℒ(X, U).

Let T, S be maps, not necessarily linear, of X into U, and U into V respectively, X,
U, V arbitrary sets. Then we can define the composition of T with S, a mapping of X
into V obtained by letting T act first, followed by S, schematically

V <--S-- U <--T-- X.


The composite is denoted by S∘T:

S∘T(x) = S(T(x)).

Note that composition is associative: if R maps V into Z, then

R∘(S∘T) = (R∘S)∘T.

Theorem 3. (i) The composite of linear mappings is also a linear mapping.

(ii) Composition is distributive with respect to the addition of linear maps, that is,


(R + S)∘T = R∘T + S∘T

and

S∘(T + P) = S∘T + S∘P,

where R and S map U → V and P and T map X → U.

Exercise 3. Prove Theorem 3.

On account of this distributive property, coupled with the associative law that
holds generally, composition of linear maps is denoted as multiplication:

S∘T ≡ ST.

We warn the reader that this kind of multiplication is generally not commutative; for
example, TS may not even be defined when ST is, much less equal to it.

Example 8. X = U = V = polynomials in s, T = d/ds, S = multiplication
by s.

Example 9. X = U = V = R^3,

S: rotation around the x_1 axis by 90 degrees; T: rotation around the x_2 axis
by 90 degrees.

Exercise 4. Show that S and T in Examples 8 and 9 are linear and that
ST ≠ TS.
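A quick numerical look at Example 8 may help before attempting the exercise. This is a sketch, assuming a polynomial in s is stored as its coefficient list [a_0, a_1, ...]; the function names `differentiate`, `multiply_by_s`, and `trim` are hypothetical. It evaluates both orders of composition on a single polynomial, which illustrates the claim but does not replace the proof.

```python
def differentiate(p):
    """T = d/ds acting on a coefficient list [a_0, a_1, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def multiply_by_s(p):
    """S = multiplication by s: every coefficient moves up one degree."""
    return [0] + list(p)

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

p = [4, 0, 3]                                  # p(s) = 4 + 3 s^2
st = trim(multiply_by_s(differentiate(p)))     # (ST)p = s p'(s)
ts = trim(differentiate(multiply_by_s(p)))     # (TS)p = (s p(s))'
print(st, ts)
```

For this p the two results differ, and their difference TSp - STp is p itself, in line with the product rule (s p)' = p + s p'.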

Definition. A linear map is called invertible if it is 1-to-1 and onto, that is, if it is
an isomorphism. The inverse is denoted as T^{-1}.

Exercise 5. Show that if T is invertible, TT^{-1} is the identity.

Theorem 4. (i) The inverse of an invertible linear map is linear.

(ii) If S and T are both invertible, and if ST is defined, then ST also is invertible,
and

(ST)^{-1} = T^{-1} S^{-1}.

Exercise 6. Prove Theorem 4.

Let T be a linear map X → U, and l a linear function on U, that is, l is an element of U'.
Then the product (i.e., composite) lT is a linear mapping of X into K, that is, an
element of X'; denote this element by m:


m(x) = l(Tx).    (8)
This defines an assignment of an element m of X' to every element l of U'. It is easy
to deduce from (8) that this assignment is a linear mapping U' → X'; it is called the
transpose of T and is denoted by T'.

Using the notation (5) in Chapter 2 to denote the value of a linear function, we can
rewrite (8) as

(m, x) = (l, Tx).    (8)'

Using the notation m = T'l, this can be written as

(T'l, x) = (l, Tx).    (9)

Exercise 7. Show that whenever meaningful,

(ST)' = T'S',    (T + R)' = T' + R',    and    (T^{-1})' = (T')^{-1}.

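Example 10. X = R^n, U = R^m, T as in Example 7,

u_i = \sum_j t_{ij} x_j.    (10)

U' is then also R^m, X' = R^n, with (l, u) = \sum_i l_i u_i, (m, x) = \sum_j m_j x_j. Then with
u = Tx, using (10) we have

(l, u) = \sum_i l_i u_i = \sum_i \sum_j l_i t_{ij} x_j
       = \sum_j ( \sum_i l_i t_{ij} ) x_j = \sum_j m_j x_j = (m, x),

where m = T'l, with

m_j = \sum_i l_i t_{ij}.    (11)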

Exercise 8. Show that if X'' is identified with X and U'' with U via (5) in
Chapter 2, then

T'' = T.

We shall show in Chapter 4 that if a mapping T is interpreted as a matrix, its
transpose T' is obtained by making the columns of T the rows of T'.
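The identity (T'l, x) = (l, Tx) of Example 10 is easy to check on numbers. The following is a minimal sketch, assuming T is stored as a list of rows (t_{i1}, ..., t_{in}); the function names `apply`, `apply_transpose`, and `pairing`, and the sample data, are hypothetical.

```python
def apply(T, x):
    """u = Tx with u_i = sum_j t_ij x_j, as in (10)."""
    return [sum(t_ij * x_j for t_ij, x_j in zip(row, x)) for row in T]

def apply_transpose(T, l):
    """m = T'l with m_j = sum_i l_i t_ij, as in (11)."""
    return [sum(l_i * row[j] for l_i, row in zip(l, T)) for j in range(len(T[0]))]

def pairing(l, x):
    """Scalar product (l, x) = sum of componentwise products."""
    return sum(l_i * x_i for l_i, x_i in zip(l, x))

T = [[1, 2, 0],
     [-1, 3, 4]]     # a map from R^3 to R^2
x = [2, 1, -1]       # a vector in X = R^3
l = [5, -2]          # a linear function on U = R^2

left = pairing(apply_transpose(T, l), x)   # (T'l, x)
right = pairing(l, apply(T, x))            # (l, Tx)
print(left, right)
```

Both sides evaluate to the same number, here 26, as relation (9) predicts.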

We recall from Chapter 2 the notion of the annihilator of a subspace* 
Theorem 5. The annihilator of the range of T is the nullspace of its transpose:

R_T^⊥ = N_{T'}.    (12)

Proof. By the definition in Chapter 2 of annihilator, the annihilator of the range
R_T consists of those linear functions l defined on the target space U for which

(l, u) = 0 for all u in R_T.

Since u in R_T consists of u = Tx, x in X, we can rewrite the above as

(l, Tx) = 0 for all x.

Using (9), we can rewrite this as

(T'l, x) = 0 for all x.

It follows that l is in R_T^⊥ iff T'l = 0; this proves Theorem 5. □

Now take the annihilator of both sides of (12). According to Theorem 5 of
Chapter 2, the annihilator of R^⊥ is R itself. In this way we obtain the following
theorem.

Theorem 5'. The range of T is the annihilator of the nullspace of T':

R_T = N_{T'}^⊥.    (12)'

(12)' is a very useful characterization of the range of a mapping. Next we give
another consequence of Theorem 5.

Theorem 6.

dim R_T = dim R_{T'}.    (13)

Proof. We apply Theorem 4 of Chapter 2 to U and its subspace R_T:

dim R_T^⊥ + dim R_T = dim U.

Next we use Theorem 2 of this chapter applied to T': U' → X':

dim N_{T'} + dim R_{T'} = dim U'.


According to Theorem 2, Chapter 2, dim U = dim U'; according to Theorem 5 of
this chapter, R_T^⊥ = N_{T'}, and so dim R_T^⊥ = dim N_{T'}. So we deduce (13) from the last
two equations. □
The following is an easy consequence of Theorem 6.

Theorem 6'. Let T be a linear mapping of X into U, and assume that X and U
have the same dimension. Then

dim N_T = dim N_{T'}.    (13)'

Proof. According to Theorem 2, applied to both T and T',

dim N_T = dim X - dim R_T,
dim N_{T'} = dim U' - dim R_{T'}.

Since dim U = dim U' is assumed to be the same as dim X, (13)' follows from the
above relations and (13). □


Theorem 6 is an abstract version of the classical result that the column rank and
row rank of a matrix are equal. The usual proofs of this result are abstruse and
unclear.
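The classical statement can be observed on an example. The sketch below assumes that the rank of a rectangular array is computed by Gaussian elimination over the rationals, a technique chosen here only for illustration; the names `rank` and `transpose` and the sample array are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by Gaussian elimination (exact rational arithmetic)."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [v_i - f * v_r for v_i, v_r in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def transpose(rows):
    """Make the columns of the array the rows of the result."""
    return [list(col) for col in zip(*rows)]

T = [[1, 2, 0, 1],
     [0, 1, 1, 1],
     [1, 3, 1, 2]]   # third row = first row + second row
print(rank(T), rank(transpose(T)))
```

The rank of T counts its independent rows and the rank of the transposed array counts its independent columns; both come out as 2 here, in agreement with (13).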

We turn now to linear mappings of a linear space X into itself. The aggregate of
such mappings is denoted as ℒ(X, X); they are a particularly important and
interesting class of maps. Any two such maps can be added and multiplied, that is,
composed, and can be multiplied by a scalar. Thus ℒ(X, X) is an algebra. We
investigate now briefly some of the algebraic aspects of ℒ(X, X).

First we remark that ℒ(X, X) is an associative, but not commutative algebra, with
a unit; the role of the unit is played by the identity map I, defined by Ix = x. The zero
map 0 is defined by 0x = 0. ℒ(X, X) contains divisors of zero, that is, pairs of
mappings S and T whose product is 0, but neither of which is 0. To see this,
choose T to be any nonzero mapping with a nontrivial nullspace N_T, and S to be any
nonzero mapping whose range is contained in N_T. Clearly, TS = 0.

There are mappings D ≠ 0 whose square D^2 is zero. As an example, take X to be
the linear space of polynomials of degree less than 2. Differentiation D maps this
space into itself. Since the second derivative of every polynomial of degree less than
2 is zero, D^2 = 0, but clearly D ≠ 0.

Exercise 9. Show that if A in ℒ(X, X) is a left inverse of B in ℒ(X, X), that is
AB = I, then it is also a right inverse: BA = I.

We have seen in Theorem 4 that the product of invertible elements is invertible.
Therefore the set of invertible elements of ℒ(X, X) forms a group under
multiplication. This group depends only on the dimension of X, and the field K of
scalars. It is denoted as GL(n, K), n = dim X.

Given an invertible element S of ℒ(X, X), we assign to each M in ℒ(X, X) the
element M_S constructed as follows:


M_S = SMS^{-1}.    (14)
This assignment M → M_S is called a similarity transformation; M is said to be
similar to M_S.

Theorem 7. (a) Every similarity transformation is an automorphism of ℒ(X, X),
that is, it maps sums into sums, products into products, scalar multiples into scalar
multiples:


(kM)_S = kM_S,    (15)

(M + K)_S = M_S + K_S,    (15)'

(MK)_S = M_S K_S.    (15)''

(b) The similarity transformations form a group:

(M_S)_T = M_{TS}.    (16)


Proof. (15) and (15)' are obvious; to verify (15)'' we use the definition (14):

M_S K_S = SMS^{-1} SKS^{-1} = SMKS^{-1} = (MK)_S,

where we made use of the associative law.

The verification of (16) is analogous; by (14),

(M_S)_T = T(SMS^{-1})T^{-1} = TSM(TS)^{-1} = M_{TS};

here we made use of the associative law, and that (TS)^{-1} = S^{-1}T^{-1}. □
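Relations (15)'' and (16) can be confirmed on small matrices. The following is a sketch under the assumption that the mappings are 2-by-2 arrays of rationals, so that the inverse has a closed form; the helper names `matmul`, `inverse_2x2`, and `similar`, and the four sample matrices, are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two arrays: entry (i, j) is row i of A times column j of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(A):
    """Closed-form inverse of an invertible 2-by-2 array."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def similar(M, S):
    """M_S = S M S^{-1}, as in (14)."""
    return matmul(matmul(S, M), inverse_2x2(S))

M = [[1, 2], [3, 4]]
K = [[0, 1], [5, -1]]
S = [[2, 1], [1, 1]]
T = [[1, 3], [0, 1]]

product_rule = similar(matmul(M, K), S) == matmul(similar(M, S), similar(K, S))
group_rule = similar(similar(M, S), T) == similar(M, matmul(T, S))
print(product_rule, group_rule)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than with a tolerance.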

Theorem 8. Similarity is an equivalence relation; that is, it is:

(i) Reflexive. M is similar to itself.

(ii) Symmetric. If M is similar to K, then K is similar to M.

(iii) Transitive. If M is similar to K, and K is similar to L, then M is similar
to L.

Proof. (i) is true because we can in the definition (14) choose S = I.

(ii) M similar to K means that

K = SMS^{-1}.    (14)'

Multiply both sides by S on the right and S^{-1} on the left, and we see that K is similar
to M.

(iii) If K is similar to L, then


L = TKT^{-1},    (14)''

where T is some invertible mapping. Multiply both sides of (14)' by T^{-1} on the right
and by T on the left; we get

TKT^{-1} = TSMS^{-1}T^{-1}.

According to (14)'', the left-hand side is L. The right-hand side can be written as

(TS)M(TS)^{-1},

which is similar to M. □

Exercise 10. Show that if M is invertible, and similar to K, then K also is
invertible, and K^{-1} is similar to M^{-1}.

Multiplication in ℒ(X, X) is not commutative, that is, AB is in general not equal
to BA. Yet they are not totally unrelated.

Theorem 9. If either A or B in ℒ(X, X) is invertible, then AB and BA are
similar.

Exercise 11. Prove Theorem 9.

Given any element A of ℒ(X, X) we can, by addition and multiplication, form all
polynomials in A:

a_N A^N + a_{N-1} A^{N-1} + ... + a_0 I;    (17)

we can write (17) as p(A), where

p(s) = a_N s^N + ... + a_0.    (17)'

The set of all polynomials in A forms a subalgebra of ℒ(X, X); this subalgebra is
commutative. Such commutative subalgebras play a big role in spectral theory,
discussed in Chapters 6 and 8.

An important class of mappings of a linear space X into itself are projections. 
Definition. A linear mapping P: X → X is called a projection if it satisfies

P^2 = P.

Example 11. X is the space of vectors x = (a_1, a_2, ..., a_n), P defined as

Px = (0, 0, a_3, ..., a_n).

That is, the action of P is to set the first two components of x equal to zero.
Exercise 12. Show that P defined above is a linear map, and that it is a
projection.

Example 12. Let X be the space of continuous functions f in the interval [-1, 1];
define Pf to be the even part of f, that is,


(Pf)(x) = (f(x) + f(-x)) / 2.


Exercise 13. Prove that P defined above is linear, and that it is a projection.
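Example 12 can be explored numerically before proving Exercise 13. The sketch below assumes functions are represented as Python callables and checks P^2 = P only at a handful of sample points, which illustrates the property but is not a proof; the name `even_part` and the sample polynomial are hypothetical.

```python
from fractions import Fraction

def even_part(f):
    """Return Pf, where (Pf)(x) = (f(x) + f(-x)) / 2."""
    return lambda x: (f(x) + f(-x)) / 2

# Hypothetical test function on [-1, 1]; its even part is 2x^2 + 1.
f = lambda x: x**3 + 2 * x**2 - x + 1
Pf = even_part(f)
PPf = even_part(Pf)

samples = [Fraction(-1), Fraction(-1, 2), Fraction(0), Fraction(1, 4), Fraction(1)]
print([Pf(x) for x in samples])
print(all(PPf(x) == Pf(x) for x in samples))
```

Applying P a second time changes nothing because Pf is already even, so the average of Pf(x) and Pf(-x) is Pf(x) again.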

Definition. The commutator of two mappings A and B of X into X is AB - BA.
Two mappings of X into X commute if their commutator is zero.

Remark. We can prove Corollary A' directly by induction on the number of
equations m, using one of the equations to express one of the unknowns x_j in terms of
the others. By substituting this expression for x_j into the remaining equations, we
have reduced the number of equations and the number of unknowns by one.

The practical execution of such a scheme has pitfalls when the number of
equations and unknowns is large. One has to pick intelligently the unknown to be
eliminated and the equation that is used to eliminate it. We shall take up these
matters in the next chapter.

Definition. The rank of a linear mapping is the dimension of its range.

Exercise 14. Suppose T is a linear map of rank 1 of a finite-dimensional vector
space into itself.

(a) Show there exists a unique number c such that T^2 = cT.

(b) Show that if c ≠ 1 then I - T has an inverse. (As usual I denotes the identity
map Ix = x.)

Exercise 15. Suppose T and S are linear maps of a finite-dimensional vector
space into itself. Show that the rank of ST is less than or equal to the rank of S. Show
that the dimension of the nullspace of ST is less than or equal to the sum of the
dimensions of the nullspaces of S and of T.
CHAPTER 4


Matrices


In Example 7 of Chapter 3 we defined a class of mappings T: R^n → R^m where the
ith component of u = Tx is expressed in terms of the components x_j of x by the
formula

u_i = \sum_{j=1}^{n} t_{ij} x_j,    i = 1, ..., m    (1)

and the t_{ij} are arbitrary scalars. These mappings are linear; conversely, we have the
following theorem.

Theorem 1. Every linear map Tx = u from R^n to R^m can be written in form (1).

Proof. The vector x can be expressed as a linear combination of the unit vectors
e_1, ..., e_n, where e_j has jth component 1, all others 0:

x = \sum_j x_j e_j.    (2)

Since T is linear,

u = Tx = \sum_j x_j Te_j.    (3)

Denote the ith component of Te_j by t_{ij}:

t_{ij} = (Te_j)_i.    (4)

It follows from (3) and (4) that the ith component u_i of u is

u_i = \sum_j t_{ij} x_j,

exactly as in formula (1). □


It is convenient and traditional to arrange the coefficients t_{ij} appearing in (1) in a
rectangular array:

\begin{pmatrix}
t_{11} & t_{12} & \cdots & t_{1n} \\
t_{21} & t_{22} & \cdots & t_{2n} \\
\vdots & \vdots &        & \vdots \\
t_{m1} & t_{m2} & \cdots & t_{mn}
\end{pmatrix}    (5)

Such an array is called an m by n (m × n) matrix, m being the number of rows,
n the number of columns. A matrix that has the same number of rows and
columns is called a square matrix. The numbers t_{ij} are called the entries of the
matrix T.

According to Theorem 1, there is a 1-to-1 correspondence between m × n
matrices and linear mappings T: R^n → R^m. We shall denote the (i, j)th entry t_{ij} of the
matrix identified with T by

T_{ij} = (T)_{ij}.    (5)'

A matrix T can be thought of as a row of column vectors, or a column of row
vectors:

T = (c_1, ..., c_n) = \begin{pmatrix} r_1 \\ \vdots \\ r_m \end{pmatrix},
c_j = \begin{pmatrix} t_{1j} \\ \vdots \\ t_{mj} \end{pmatrix},
r_i = (t_{i1}, ..., t_{in}).    (6)

According to (4), the ith component of Te_j is t_{ij}; according to (6), the ith component
of c_j is t_{ij}. Thus

Te_j = c_j.    (7)
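Formula (7) gives a recipe for writing down the matrix of a linear map: apply the map to each unit vector and use the images as columns. Here is a minimal sketch of that recipe, assuming the map is given as a Python function on lists; the names `matrix_of`, `example_map`, and `apply` are hypothetical.

```python
def matrix_of(T, n):
    """Rows of the matrix whose j-th column is c_j = T e_j, as in (7)."""
    columns = []
    for j in range(n):
        e_j = [0] * n
        e_j[j] = 1
        columns.append(T(e_j))
    # Entry (i, j) is the i-th component of T e_j, so transpose the list of columns.
    return [list(row) for row in zip(*columns)]

# Hypothetical linear map from R^3 to R^2, used only for the demonstration.
def example_map(x):
    return [x[0] + 2 * x[1], 3 * x[2] - x[0]]

def apply(M, x):
    """u_i = sum_j t_ij x_j, as in formula (1)."""
    return [sum(t * x_j for t, x_j in zip(row, x)) for row in M]

M = matrix_of(example_map, 3)
print(M)
print(apply(M, [1, 1, 1]), example_map([1, 1, 1]))
```

The matrix reproduces the map on every vector only because `example_map` is linear; that is the content of Theorem 1.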


This formula shows that, as a consequence of the decision to put t_{ij} in the ith row and
jth column, the image of e_j under T appears as a column vector. To be consistent, we
shall write all vectors in U = R^m as column vectors:

u = \begin{pmatrix} u_1 \\ \vdots \\ u_m \end{pmatrix}.
We shall also write elements of X = R^n as column vectors:

x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.

The matrix representation (6) of a linear map l from R^n to R is a single row vector
of n components:

l = (l_1, ..., l_n),    lx = l_1 x_1 + ... + l_n x_n.    (8)

We define by (8) the product of a row vector r with a column vector x, in this order. It
can be used to give a compact description of formula (1) giving the action of a matrix
on a column vector:

Tx = \begin{pmatrix} r_1 x \\ \vdots \\ r_m x \end{pmatrix},    (9)

where r_1, ..., r_m are the rows of the matrix T.

In Chapter 3 we have described the algebra of linear mappings. Since matrices
represent linear mappings of R^n into R^m, there is a corresponding algebra of matrices.
Let S and T be m × n matrices, representing mappings of R^n to R^m. Their sum
T + S represents the sum of these mappings. It follows from formula (4) that the
entries of T + S are the sums of the corresponding entries of T and S:

(T + S)_{ij} = T_{ij} + S_{ij}.


Next we show how to use (8) and (9) to calculate the elements of the product of
two matrices. Let T, S be matrices

T: R^n → R^m,    S: R^m → R^l.

Since the target space of T is the domain space of S, the product ST is well-defined.
According to formula (7) applied to ST, the jth column of ST is

STe_j.

According to (7), Te_j = c_j; applying (9) to x = Te_j, and S in place of T gives

STe_j = Sc_j = \begin{pmatrix} s_1 c_j \\ \vdots \\ s_l c_j \end{pmatrix},

where s_k denotes the kth row of S. Thus we deduce this rule:
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Rule of matrix multiplicatim. Lei T be an m x matrix and S an / x m matrix. 
Then the product of ST is an / x n matrix whose (kj)ih cniry is the product of the Alh 
row of S and the jth column of T: 


(ST) V = s k c Jf 

where & is the k\h row of S and cj is the jih column of T. 
In terms of entries. 


^ r hj = T, s ^ 


(10) 


( 10 / 


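Example 1. $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$.

Example 2. $\begin{pmatrix} 1 \\ 2 \end{pmatrix} (3 \;\; 4) = \begin{pmatrix} 3 & 4 \\ 6 & 8 \end{pmatrix}$.

Example 3. $(3 \;\; 4) \begin{pmatrix} 1 \\ 2 \end{pmatrix} = (11)$.

Example 4. $(1 \;\; 2) \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} = (13 \;\; 16)$.

Example 5. $\begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 11 \\ 17 \end{pmatrix}$.

Example 6. $(1 \;\; 2) \left[ \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right] = (1 \;\; 2) \begin{pmatrix} 11 \\ 17 \end{pmatrix} = (45)$;

$\left[ (1 \;\; 2) \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} \right] \begin{pmatrix} 1 \\ 2 \end{pmatrix} = (13 \;\; 16) \begin{pmatrix} 1 \\ 2 \end{pmatrix} = (45)$.

Example 7. $\begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 23 & 34 \\ 31 & 46 \end{pmatrix}$.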

Examples 1 and 7 show that matrix multiplication of square matrices need not be
commutative. Example 6 is an illustration of the associative property of matrix
multiplication.
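As a minimal sketch (not part of the original text), the entrywise rule (10)' can be carried out in a few lines of Python. The function name `matmul` and the representation of a matrix as a list of rows are assumptions made only for this illustration.

```python
def matmul(S, T):
    """Product ST of an l x m matrix S and an m x n matrix T (lists of rows)."""
    m = len(T)
    if any(len(row) != m for row in S):
        raise ValueError("number of columns of S must equal number of rows of T")
    n = len(T[0])
    # (ST)_kj = sum_i S_ki * T_ij: the kth row of S times the jth column of T.
    return [[sum(S[k][i] * T[i][j] for i in range(m)) for j in range(n)]
            for k in range(len(S))]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
print(matmul(B, A))  # [[23, 34], [31, 46]]
```

The two printed products differ, which is the failure of commutativity noted above for Examples 1 and 7.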


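Exercise 1. Let A be an arbitrary $m \times n$ matrix, and let D be a diagonal matrix,

$$D_{ij} = \begin{cases} d_i & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases}$$

of size $m \times m$ when it multiplies A from the left and of size $n \times n$ when it multiplies A
from the right. Show that the ith row of DA equals $d_i$ times the ith row of A, and show that the jth
column of AD equals $d_j$ times the jth column of A.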

An $n \times n$ matrix A represents a mapping of $\mathbb{R}^n$ into $\mathbb{R}^n$. If this mapping is
invertible, the matrix A is called invertible.

Remark. Since the composition of linear mappings is associative, matrix
multiplication, which is the composition of mappings from $\mathbb{R}^n$ to $\mathbb{R}^m$ with mappings
from $\mathbb{R}^m$ to $\mathbb{R}^l$, also is associative.

We shall identify the dual of the space $\mathbb{R}^n$ of all column vectors with n
components as the space $(\mathbb{R}^n)'$ of all row vectors with n components.

The action of a vector l in the dual space $(\mathbb{R}^n)'$ on a vector x of $\mathbb{R}^n$, denoted
by brackets in formula (6) of Chapter 2, shall be taken to be the matrix
product (8):

$$(l, x) = l_1 x_1 + \cdots + l_n x_n. \tag{11}$$

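Let x, T, and l be linear mappings as follows:

$$l: \mathbb{R}^m \to \mathbb{R}, \qquad T: \mathbb{R}^n \to \mathbb{R}^m, \qquad x: \mathbb{R} \to \mathbb{R}^n.$$

According to the associative law,

$$(lT)x = l(Tx). \tag{12}$$

We identify l with an element of $(\mathbb{R}^m)'$, and lT with an element of $(\mathbb{R}^n)'$. Using the
notation (11) we can rewrite (12) as

$$(lT, x) = (l, Tx). \tag{13}$$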

We recall now the definition of the transpose T' of T, defined by formula (9) of
Chapter 3,

$$(T'l, x) = (l, Tx). \tag{13$'$}$$

Comparing (13) and (13)' we see that the matrix T acting from the right on row
vectors is the transpose of the matrix T acting from the left on column vectors.

To represent the transpose T' as a matrix acting on column vectors, we change its
rows into columns, its columns into rows, and denote the resulting matrix as $T^T$:

$$(T^T)_{ij} = T_{ji}. \tag{13$''$}$$

Given a row vector $r = (r_1, \ldots, r_n)$, we denote by $r^T$ the column vector with the
same components. Similarly, given a column vector c, $c^T$ denotes the row vector with
the same components.

Next we turn to expressing the range of T in matrix language. Setting (7),
$c_j = Te_j$, into (3), $Tx = \sum x_j Te_j$, gives

$$u = Tx = x_1 c_1 + \cdots + x_n c_n.$$

This gives the following theorem.

Theorem 2. The range of T consists of all linear combinations of the columns
of the matrix T.

The dimension of this space is called in old-fashioned texts the column rank of T.
The row rank is defined similarly; (13)'' shows that the row rank of T is the
dimension of the range of $T^T$. Since according to Theorem 6 of Chapter 3,

$$\dim R_T = \dim R_{T'},$$

we conclude that the column rank and row rank of a matrix are equal.

Exercise 2. Look up in any text the proof that the row rank of a matrix equals
its column rank, and compare it to the proof given in the present text.

We show now how to represent a linear mapping $T: X \to U$ by a matrix. We have
seen in Chapter 1 that X is isomorphic to $\mathbb{R}^n$, n = dim X, and U isomorphic to $\mathbb{R}^m$,
m = dim U. The isomorphisms are accomplished by choosing a basis in X,
$y_1, \ldots, y_n$, and then mapping $y_j \leftrightarrow e_j$, $j = 1, \ldots, n$:

$$B: X \to \mathbb{R}^n; \tag{14}$$

similarly,

$$C: U \to \mathbb{R}^m. \tag{14$'$}$$

Clearly, there are as many isomorphisms as there are bases. We can use any of these
isomorphisms to represent T as a mapping $\mathbb{R}^n \to \mathbb{R}^m$, obtaining a matrix representation M:

$$CTB^{-1} = M. \tag{15}$$

When T is a mapping of a space X into itself, we use the same isomorphism in
(14) and (14)', that is, we take B = C. So in this case the matrix representing T has
the form

$$BTB^{-1} = M. \tag{15$'$}$$

Suppose we change the isomorphism B. How does the matrix representing T
change? If C is another isomorphism $X \to \mathbb{R}^n$, the new matrix N representing T is
$N = CTC^{-1}$. We can write, using the associative rule and (15)',

$$N = CTC^{-1} = CB^{-1}\,BTB^{-1}\,BC^{-1} = SMS^{-1}, \tag{16}$$

where $S = CB^{-1}$. Since B and C both map X onto $\mathbb{R}^n$, $CB^{-1} = S$ maps $\mathbb{R}^n$ onto $\mathbb{R}^n$,
that is, S is an invertible $n \times n$ matrix.

Two square matrices N and M related to each other as in (16) are called
similar. Our analysis shows that similar matrices describe the same mapping of a
space into itself, in different bases. Therefore we expect similar matrices to have
the same intrinsic properties; we shall make the meaning of this more precise in
Chapter 6.

We can write any $n \times n$ matrix A in $2 \times 2$ block form:

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$

where $A_{11}$ is the submatrix of A contained in the first k rows and columns, $A_{12}$ the
submatrix contained in the first k rows and the last n - k columns, and so on.

Exercise 3. Show that the product of two matrices in $2 \times 2$ block form can be
evaluated as

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}.$$

The inversion of matrices will be discussed from a theoretical point of view in
Chapter 5, and from a numerical point of view in Chapter 17.

A matrix that is not invertible is called singular.

Definition. The square matrix I whose elements are $I_{ij} = 0$ when $i \neq j$,
$I_{jj} = 1$, is called the unit matrix.

Definition. A square matrix $(t_{ij})$ for which $t_{ij} = 0$ for i > j is called upper
triangular. Lower triangular is defined similarly.

Definition. A square matrix $(t_{ij})$ for which $t_{ij} = 0$ when |i - j| > 1 is called
tridiagonal.

Exercise 4. Construct two $2 \times 2$ matrices A and B such that AB = 0 but
$BA \neq 0$.

We turn now to the most important, certainly the oldest, way to solve sets of linear
equations, Gaussian elimination. We illustrate it on a simple example of four linear
equations for four unknowns $x_1$, $x_2$, $x_3$, and $x_4$:

$$\begin{aligned} x_1 + 2x_2 + 3x_3 - x_4 &= -2, \\ 2x_1 + 5x_2 + 4x_3 - 3x_4 &= 1, \\ 2x_1 + 3x_2 + 4x_3 + x_4 &= 1, \\ x_1 + 4x_2 + 2x_3 - 2x_4 &= 3. \end{aligned} \tag{17}$$

We solve this system of equations by eliminating the unknowns one by one; here is
how it is done. We use the first equation in (17) to eliminate $x_1$ from the rest of the
equations. To accomplish this, subtract two times the first equation from the second
and the third equations, obtaining

$$x_2 - 2x_3 - x_4 = 5 \tag{18$_1$}$$

and

$$-x_2 - 2x_3 + 3x_4 = 5. \tag{18$_2$}$$

Subtract the first equation from the fourth one, obtaining

$$2x_2 - x_3 - x_4 = 5. \tag{18$_3$}$$

We use the same technique to eliminate $x_2$ from the set of three equations (18).
We obtain

$$-4x_3 + 2x_4 = 10, \tag{19$_1$}$$

$$3x_3 + x_4 = -5. \tag{19$_2$}$$

Finally we eliminate $x_3$ from equations (19) by adding 3/4 times (19)$_1$ to (19)$_2$; we
get

$$\tfrac{5}{2} x_4 = \tfrac{5}{2},$$

which yields

$$x_4 = 1. \tag{20$_4$}$$

We proceed in the reverse order, by backward substitution, to determine the other
unknowns. Setting the value of $x_4$ from (20)$_4$ into equation (19)$_1$ gives

$$-4x_3 + 2 = 10,$$

which yields

$$x_3 = -2. \tag{20$_3$}$$

We could have used equation (19)$_2$ and would have gotten the same answer.
We determine $x_2$ from any of the equations (18), say (18)$_1$, by using (20)$_3$ and
(20)$_4$ for $x_3$ and $x_4$. We get

$$x_2 + 4 - 1 = 5,$$

so

$$x_2 = 2. \tag{20$_2$}$$

Finally we determine $x_1$ from, say, the first equation (17), using the previously
determined values of $x_4$, $x_3$, and $x_2$:

$$x_1 + 4 - 6 - 1 = -2,$$

$$x_1 = 1. \tag{20$_1$}$$


Exercise 5. Show that $x_1$, $x_2$, $x_3$, and $x_4$ given by (20)$_j$ satisfy all four equations
(17).

Notice that the order in which we eliminate the unknowns, along with the
equations which we use to eliminate them, is arbitrary. We shall return to these
points.
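The elimination and backward substitution just described can be sketched in Python. This is only an illustration under stated assumptions: the name `gauss_solve` is ours, the coefficients are those of system (17) as read from the text, exact rational arithmetic from the standard `fractions` module stands in for hand computation, and the unknowns are eliminated in their natural order with no reordering of equations.

```python
from fractions import Fraction


def gauss_solve(rows, rhs):
    """Solve a square system by elimination in the given order (no pivoting)."""
    n = len(rows)
    # Augmented matrix with exact rational entries.
    a = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for k in range(n):                       # eliminate x_k from the later equations
        if a[k][k] == 0:
            raise ZeroDivisionError("zero pivot; the equations must be reordered")
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            a[i] = [aij - factor * akj for aij, akj in zip(a[i], a[k])]
    x = [Fraction(0)] * n
    for k in reversed(range(n)):             # backward substitution
        s = sum(a[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (a[k][n] - s) / a[k][k]
    return x


T = [[1, 2, 3, -1], [2, 5, 4, -3], [2, 3, 4, 1], [1, 4, 2, -2]]
u = [-2, 1, 1, 3]
print([int(v) for v in gauss_solve(T, u)])  # [1, 2, -2, 1]
```

The intermediate rows produced by the loop are exactly equations (18) and (19), and the printed values agree with (20).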

A system of n equations

$$\sum_{j=1}^{n} t_{ij} x_j = u_i, \qquad i = 1, \ldots, n, \tag{21}$$

for n unknowns $x_1, \ldots, x_n$ may have a unique solution, may have no solution, or may
have many solutions. We show now how to use Gaussian elimination to determine all
solutions, or conclude that no solution exists. Here is an example that illustrates the
last two possibilities.

$$\begin{aligned} x_1 + x_2 + 2x_3 + 3x_4 &= u_1, \\ x_1 + 2x_2 + 3x_3 + x_4 &= u_2, \\ 2x_1 + x_2 + 2x_3 + 3x_4 &= u_3, \\ 3x_1 + 4x_2 + 6x_3 + 2x_4 &= u_4. \end{aligned} \tag{22}$$

We eliminate $x_1$ from the last three equations by subtracting from them an
appropriate multiple of the first equation:

$$\begin{aligned} x_2 + x_3 - 2x_4 &= u_2 - u_1, \\ -x_2 - 2x_3 - 3x_4 &= u_3 - 2u_1, \\ x_2 - 7x_4 &= u_4 - 3u_1. \end{aligned}$$

We use the first equation above to eliminate $x_2$ from the last two:

$$\begin{aligned} -x_3 - 5x_4 &= u_3 + u_2 - 3u_1, \\ -x_3 - 5x_4 &= u_4 - u_2 - 2u_1. \end{aligned}$$

We eliminate $x_3$ by subtracting the last two equations from each other. We find that
thereby we have eliminated $x_4$ as well, and we get

$$0 = u_4 - u_3 - 2u_2 + u_1. \tag{23}$$

This is the necessary and sufficient condition for the system of equations (22) to have
a solution.

Exercise 6. Choose values of $u_1$, $u_2$, $u_3$, $u_4$ so that condition (23) is satisfied,
and determine all solutions of equations (22).

Equation (22) can be written in matrix notation as

$$Mx = u, \tag{22$'$}$$

where x and u are column vectors with components $x_1, x_2, x_3, x_4$ and $u_1, u_2, u_3, u_4$,
and

$$M = \begin{pmatrix} 1 & 1 & 2 & 3 \\ 1 & 2 & 3 & 1 \\ 2 & 1 & 2 & 3 \\ 3 & 4 & 6 & 2 \end{pmatrix}.$$

Exercise 7. Verify that l = (1, -2, -1, 1) is a left null vector of M:

$$lM = 0.$$

Multiply equation (22)' on the left by l; using the result of Exercise 7, we get that

$$lMx = lu = 0,$$

a rederivation of (23) as a necessary condition for (22)' to have a solution.

Exercise 8. Show by Gaussian elimination that the only left null vectors of M
are multiples of l in Exercise 7, and then use Theorem 5 of Chapter 3 to show that
condition (23) is sufficient for the solvability of the system (22).
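A quick numerical check of the left null vector and of condition (23) can be written as follows. This is an illustrative sketch only; the helper names `row_times_matrix` and `compatible` are assumptions introduced here, and the check does not replace the proofs asked for in Exercises 7 and 8.

```python
M = [[1, 1, 2, 3],
     [1, 2, 3, 1],
     [2, 1, 2, 3],
     [3, 4, 6, 2]]
l = [1, -2, -1, 1]


def row_times_matrix(row, matrix):
    """Row vector times matrix: the jth entry is sum_i row_i * matrix_ij."""
    return [sum(row[i] * matrix[i][j] for i in range(len(matrix)))
            for j in range(len(matrix[0]))]


def compatible(u):
    """Condition (23): u_1 - 2 u_2 - u_3 + u_4 = 0, that is, l u = 0."""
    return sum(li * ui for li, ui in zip(l, u)) == 0


print(row_times_matrix(l, M))    # [0, 0, 0, 0]
print(compatible([1, 1, 1, 2]))  # True:  1 - 2 - 1 + 2 = 0
print(compatible([1, 0, 0, 0]))  # False: no solution exists for this u
```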


Next we show how to use Gaussian elimination to prove Corollary A' in
Chapter 3:

A system of homogeneous linear equations

$$\sum_{j=1}^{n} t_{ij} x_j = 0, \qquad i = 1, \ldots, m, \tag{24}$$

with fewer equations than unknowns, m < n, has a nontrivial solution, that is, one
where at least one of the $x_j$ is nonzero.

Proof. We use one of the equations (24) to express $x_1$ as a linear function of the
rest of the x's:

$$x_1 = l_1(x_2, \ldots, x_n). \tag{25$_1$}$$

We replace $x_1$ by $l_1$ in the remaining equations, and we use one of them to express
$x_2$ as a linear function of the remaining x's:

$$x_2 = l_2(x_3, \ldots, x_n). \tag{25$_2$}$$

We proceed in this fashion until we reach $x_m$:

$$x_m = l_m(x_{m+1}, \ldots, x_n). \tag{25$_m$}$$

Since there were only m equations and m < n, there are no more equations left to be
satisfied. So we choose the values of $x_{m+1}, \ldots, x_n$ arbitrarily (not all zero), and we use equations
(25)$_m$, (25)$_{m-1}$, ..., (25)$_1$, in this order, to determine the values $x_m, x_{m-1}, \ldots, x_1$.

This procedure may break down at the ith step if none of the remaining equations
contain $x_i$. In this case we set $x_{i+1}, \ldots, x_n$ equal to zero, assign an arbitrary nonzero value to
$x_i$, and determine $x_{i-1}, \ldots, x_1$ from equations (25)$_{i-1}$, ..., (25)$_1$, in this
order. □

We conclude this chapter with some observations on how Gaussian elimination
works for determined systems of n inhomogeneous equations

$$\sum_{j=1}^{n} t_{ij} x_j = u_i, \qquad i = 1, \ldots, n, \tag{26}$$

for n unknowns $x_1, \ldots, x_n$. In its basic form the first equation is used to eliminate $x_1$,
that is, express it as

$$x_1 = v_1 + l_1(x_2, \ldots, x_n). \tag{27$_1$}$$

Then $x_1$ is replaced in the remaining equations by $v_1 + l_1$. The first of these equations
is used to express $x_2$ as

$$x_2 = v_2 + l_2(x_3, \ldots, x_n). \tag{27$_2$}$$

We proceed in this fashion until after (n - 1) steps we find the value of $x_n$. Then we
determine the values of $x_{n-1}, \ldots, x_1$, in this order, from the relations
(27)$_{n-1}$, ..., (27)$_1$.

This procedure may break down right at the start if the coefficient $t_{11}$ of $x_1$ in the
first equation is zero. Even if $t_{11}$ is not zero but very small, using the first equation to
express $x_1$ in terms of the rest of the x's involves division by $t_{11}$ and produces very
large coefficients in formula (27)$_1$. This wouldn't matter if all arithmetic operations
were carried out exactly, but they never are; they are carried out in finite digit
floating point arithmetic, and when (27)$_1$ is substituted in the remaining equations,
the coefficients $t_{ij}$, i > 1, are swamped.

A natural remedy is to choose another unknown, $x_j$, for elimination and another
equation to accomplish it, so chosen that $t_{ij}$ is not small compared with the other
coefficients. This strategy is called complete pivoting and is computationally
expensive. A compromise is to keep the original order of the unknowns for
elimination, but use another equation for elimination, for which $t_{i1}$ is not small
compared to the other coefficients. This strategy, called partial pivoting, works very
well in practice (see, e.g., the text entitled Numerical Linear Algebra, by Trefethen
and Bau).
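The swamping effect and its remedy can be seen on a two-by-two example. The sketch below is an illustration under our own assumptions: the system $\varepsilon x_1 + x_2 = 1$, $x_1 + x_2 = 2$ with $\varepsilon = 10^{-20}$ is chosen by us, the function name `solve` and its `pivot` flag are hypothetical, and ordinary Python floats play the role of finite digit arithmetic. The exact solution is very close to $x_1 = 1$, $x_2 = 1$.

```python
def solve(rows, rhs, pivot):
    """Gaussian elimination in floating point, with optional partial pivoting."""
    n = len(rows)
    a = [[float(v) for v in row] + [float(b)] for row, b in zip(rows, rhs)]
    for k in range(n):
        if pivot:
            # Partial pivoting: among the remaining equations, use the one whose
            # coefficient of x_k is largest in absolute value.
            p = max(range(k, n), key=lambda i: abs(a[i][k]))
            a[k], a[p] = a[p], a[k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            a[i] = [aij - factor * akj for aij, akj in zip(a[i], a[k])]
    x = [0.0] * n
    for k in reversed(range(n)):             # backward substitution
        s = sum(a[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (a[k][n] - s) / a[k][k]
    return x


eps = 1e-20
T = [[eps, 1.0], [1.0, 1.0]]
u = [1.0, 2.0]
print(solve(T, u, pivot=False))  # x_1 comes out as 0.0: the second equation was swamped
print(solve(T, u, pivot=True))   # both unknowns come out close to 1.0
```

Without pivoting, dividing by $\varepsilon$ multiplies the first equation by $10^{20}$, and the coefficients of the second equation disappear in rounding; exchanging the two equations avoids the division by a small number.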
CHAPTER 5


Determinant and Trace


In this chapter we shall use the intuitive properties of volume to define the
determinant of a square matrix. According to the precepts of elementary geometry,
the concept of volume depends on the notions of length and angle and, in particular,
perpendicularity, concepts that will be defined only in Chapter 8. Nevertheless, it
turns out that volume is independent of all these things, except for an arbitrary
multiplicative constant that can be fixed by specifying that the unit cube have
volume one.

We start with the geometric motivation and meaning of determinants. A simplex
in $\mathbb{R}^n$ is a polyhedron with n + 1 vertices. We shall take one of the vertices to be the
origin and denote the rest as $a_1, \ldots, a_n$. The order in which the vertices are taken
matters, so we call $0, a_1, \ldots, a_n$ the vertices of an ordered simplex.

We shall be dealing with two geometrical attributes of ordered simplices, their
orientation and volume. An ordered simplex S is called degenerate if it lies on an
(n - 1)-dimensional subspace.

An ordered simplex $(0, a_1, \ldots, a_n) = S$ that is nondegenerate can have one of two
orientations: positive or negative. We call S positively oriented if it can be deformed
continuously and nondegenerately into the standard ordered simplex $(0, e_1, \ldots, e_n)$,
where $e_j$ is the jth unit vector in the standard basis of $\mathbb{R}^n$. By such deformation we
mean n vector-valued continuous functions $a_j(t)$ of t, $0 \le t \le 1$, such that (i)
$S(t) = (0, a_1(t), \ldots, a_n(t))$ is nondegenerate for all t and (ii) $a_j(0) = a_j$, $a_j(1) = e_j$.

Otherwise S is called negatively oriented.

For a nondegenerate oriented simplex S we define O(S) as +1 or -1, depending
on the orientation of S, and zero when S is degenerate.

The volume of a simplex is given by the elementary formula

$$\mathrm{Vol}(S) = \frac{1}{n}\,\mathrm{Vol}_{n-1}(\text{Base})\;\text{Altitude}. \tag{1}$$

By base we mean any of the (n - 1)-dimensional faces of S, and by altitude we mean
the distance of the opposite vertex from the hyperplane that contains the base.

A more useful concept is signed volume, denoted as $\Sigma(S)$, and defined by

$$\Sigma(S) = O(S)\,\mathrm{Vol}(S). \tag{2}$$

Since S is described by its vertices, $\Sigma(S)$ is a function of $a_1, \ldots, a_n$. Clearly, when
two vertices are equal, S is degenerate, and therefore we have the following:

(i) $\Sigma(S) = 0$ if $a_j = a_k$, $j \neq k$.

A second property of $\Sigma(S)$ is its dependence on $a_j$ when the other vertices are
kept fixed:

(ii) $\Sigma(S)$ is a linear function of $a_j$ when the other $a_k$, $k \neq j$, are kept fixed.

Let us see why. We combine formulas (1) and (2) as

$$\Sigma(S) = \frac{1}{n}\,\mathrm{Vol}_{n-1}(\text{Base})\,k, \tag{1$'$}$$

where

$$k = O(S)\;\text{Altitude}.$$

The altitude is the distance of the vertex $a_j$ from the base; we call k the signed distance of the
vertex from the hyperplane containing the base, because O(S) has one sign when $a_j$
lies on one side of the base and the opposite sign when $a_j$ lies on the opposite side.
We claim that when the base is fixed, k is a linear function of $a_j$. To see why this is
so we introduce Cartesian coordinate axes so that the first axis is perpendicular to the
base and the rest lie in the base plane. By definition of Cartesian coordinates, the first
coordinate $k_1(a)$ of a vector a is its signed distance from the hyperplane spanned by
the other axes. According to Theorem 1(i) in Chapter 2, $k_1(a)$ is a linear function of
a. Assertion (ii) now follows from formula (1)'.

Determinants are related to the signed volume of ordered simplices by the
classical formula

$$\Sigma(S) = \frac{1}{n!}\,D(a_1, \ldots, a_n), \tag{3}$$

where D is the abbreviation of the determinant whose columns are $a_1, \ldots, a_n$. Rather
than start with a formula for the determinant, we shall deduce it from the properties
forced on it by the geometric properties of signed volume. This approach to
determinants is due to E. Artin.

Property (i). $D(a_1, \ldots, a_n) = 0$ if $a_i = a_j$, $i \neq j$.

Property (ii). $D(a_1, \ldots, a_n)$ is a multilinear function of its arguments, in the
sense that if all $a_i$, $i \neq j$, are fixed, D is a linear function of the remaining
argument $a_j$.

Property (iii). Normalization:

$$D(e_1, \ldots, e_n) = 1. \tag{4}$$

We show now that all remaining properties of D can be deduced from those so far
postulated.

Property (iv). D is an alternating function of its arguments, in the sense that if
$a_i$ and $a_j$ are interchanged, $i \neq j$, the value of D changes by the factor (-1).

Proof. Since only the ith and jth arguments change, we shall indicate only these.
Setting $a_i = a$, $a_j = b$, we can write, using Properties (i) and (ii):

$$\begin{aligned} D(a, b) &= D(a, b) + D(a, a) = D(a, a + b) \\ &= D(a, a + b) - D(a + b, a + b) \\ &= -D(b, a + b) = -D(b, a) - D(b, b) = -D(b, a). \qquad \Box \end{aligned}$$

Property (v). If $a_1, \ldots, a_n$ are linearly dependent, then $D(a_1, \ldots, a_n) = 0$.

Proof. If $a_1, \ldots, a_n$ are linearly dependent, then one of them, say $a_1$, can be
expressed as a linear combination of the others:

$$a_1 = k_2 a_2 + \cdots + k_n a_n.$$

Then, using Property (ii),

$$\begin{aligned} D(a_1, \ldots, a_n) &= D(k_2 a_2 + \cdots + k_n a_n, a_2, \ldots, a_n) \\ &= k_2 D(a_2, a_2, \ldots, a_n) + \cdots + k_n D(a_n, a_2, \ldots, a_n). \end{aligned}$$

By Property (i), all terms in the last line are zero. □

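Next we introduce the concept of permutation. A permutation is a mapping p of n
objects, say the numbers 1, 2, ..., n, onto themselves. Like all functions,
permutations can be composed. Being onto, they are one-to-one and so can be
inverted. Thus they form a group; these groups, except for n = 2, are
noncommutative.

We denote p(k) as $p_k$; it is convenient to display the action of p by a table:

$$\begin{pmatrix} 1 & 2 & \cdots & n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}.$$

Example 1. $p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 1 & 3 \end{pmatrix}$. Then

$$p^2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}, \qquad p^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix},$$

$$p^3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix}, \qquad p^4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}.$$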


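Next we introduce the concept of signature of a permutation, denoted as $\sigma(p)$. Let
$x_1, \ldots, x_n$ be n variables; their discriminant is defined to be

$$P(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j). \tag{5}$$

Let p be any permutation. Clearly,

$$P(p(x_1, \ldots, x_n)) = \prod_{i<j} (x_{p_i} - x_{p_j})$$

is either $P(x_1, \ldots, x_n)$ or $-P(x_1, \ldots, x_n)$.

Definition. The signature $\sigma(p)$ of a permutation p is defined by

$$P(p(x_1, \ldots, x_n)) = \sigma(p)\,P(x_1, \ldots, x_n). \tag{6}$$

Properties of signature:

$$\text{(a)}\;\; \sigma(p) = +1 \text{ or } -1, \qquad \text{(b)}\;\; \sigma(p_1 \circ p_2) = \sigma(p_1)\,\sigma(p_2). \tag{7}$$

Exercise 1. Prove properties (7).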

We look now at a special kind of permutation, an interchange. These are defined
for any pair of indices j, k, $j \neq k$, as follows:

$$p(i) = i \quad \text{for } i \neq j \text{ or } k,$$

$$p(j) = k, \qquad p(k) = j.$$

Such a permutation is called a transposition. We claim that transpositions have the
following properties:

(c) The signature of a transposition t is minus one:

$$\sigma(t) = -1. \tag{8}$$

(d) Every permutation p can be written as a composition of transpositions:

$$p = t_k \circ \cdots \circ t_1. \tag{9}$$

Exercise 2. Prove (c) and (d) above.

Combining (7) with (8) and (9) we get that

$$\sigma(p) = (-1)^k, \tag{10}$$

where k is the number of factors in the decomposition (9) of p.
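Formula (10) can be checked experimentally. The sketch below is an illustration of ours, not the author's method: `signature` counts the pairs i < j with $p_i > p_j$, each of which reverses the sign of one factor of the discriminant, and `transposition_count` sorts the permutation back to the identity by interchanges and counts them. Both function names and the convention of storing a permutation as the bottom row of its table are assumptions.

```python
from itertools import combinations, permutations


def signature(p):
    """Signature of p, given as the bottom row of its table, e.g. (2, 4, 1, 3)."""
    # A pair i < j with p_i > p_j flips the sign of the factor (x_{p_i} - x_{p_j}).
    flips = sum(1 for i, j in combinations(range(len(p)), 2) if p[i] > p[j])
    return -1 if flips % 2 else 1


def transposition_count(p):
    """Number of interchanges used to bring p back to the identity."""
    q, k = list(p), 0
    for i in range(len(q)):
        while q[i] != i + 1:
            j = q[i] - 1
            q[i], q[j] = q[j], q[i]   # one transposition
            k += 1
    return k


p = (2, 4, 1, 3)
print(signature(p), transposition_count(p))  # -1 3

# sigma(p) = (-1)^k for every permutation of four objects
print(all(signature(q) == (-1) ** transposition_count(q)
          for q in permutations(range(1, 5))))  # True
```

The count k found this way is only one of many possible decompositions; what the check confirms is that its parity agrees with the signature.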

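Exercise 3. Show that the decomposition (9) is not unique, but that the parity
of the number k of factors is unique.

Example 2. The permutation $p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix}$ is the product of three transpositions

$$t_1 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 5 & 4 & 3 \end{pmatrix}, \qquad t_2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 3 & 4 & 5 \end{pmatrix}, \qquad t_3 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 2 & 3 & 1 & 5 \end{pmatrix}:$$

$$p = t_3 \circ t_2 \circ t_1.$$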

We return now to the function D. Its arguments $a_j$ are column vectors

$$a_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{pmatrix}, \qquad j = 1, \ldots, n. \tag{11}$$

This is the same as

$$a_j = a_{1j} e_1 + \cdots + a_{nj} e_n. \tag{11$'$}$$

Using Property (ii), multilinearity, we can write

$$\begin{aligned} D(a_1, \ldots, a_n) &= D(a_{11} e_1 + \cdots + a_{n1} e_n, a_2, \ldots, a_n) \\ &= a_{11} D(e_1, a_2, \ldots, a_n) + \cdots + a_{n1} D(e_n, a_2, \ldots, a_n). \end{aligned} \tag{12}$$

Next we express $a_2$ as a linear combination of $e_1, \ldots, e_n$ and obtain a formula like
(12) but containing $n^2$ terms. Repeating this process n times we get

$$D(a_1, \ldots, a_n) = \sum_f a_{f_1 1} a_{f_2 2} \cdots a_{f_n n}\, D(e_{f_1}, \ldots, e_{f_n}), \tag{13}$$

where the summation is over all functions f mapping {1, ..., n} into {1, ..., n}.
If the mapping f is not a permutation, then $f_i = f_j$ for some pair $i \neq j$, and by
Property (i),

$$D(e_{f_1}, \ldots, e_{f_n}) = 0. \tag{14}$$

This shows that in (13) we need sum only over those f that are permutations.

We saw earlier that each permutation can be decomposed into k transpositions
(9). According to Property (iv), a single transposition of its arguments changes the
value of D by a factor of (-1). Therefore k transpositions change it by the factor
$(-1)^k$. Thus, using (10),

$$D(e_{p_1}, \ldots, e_{p_n}) = \sigma(p)\,D(e_1, \ldots, e_n) \tag{15}$$

for any permutation. Setting (14) and (15) into (13) we get, after using the
normalization (4), that

$$D(a_1, \ldots, a_n) = \sum_p \sigma(p)\, a_{p_1 1} \cdots a_{p_n n}. \tag{16}$$

This is the formula for D in terms of the components of its arguments.
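Formula (16) translates directly into a short program. The following is a sketch for illustration only; the names `sign` and `det_by_permutations` are assumptions, indices are counted from 0 as is usual in Python, and the cost grows like n!, so this is a check of the formula rather than a practical method.

```python
from itertools import permutations


def sign(p):
    """Signature of a permutation p of (0, ..., n-1), by counting reversed pairs."""
    n = len(p)
    flips = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if flips % 2 else 1


def det_by_permutations(A):
    """Sum over all permutations p of sigma(p) * a_{p_1 1} * ... * a_{p_n n}."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for j in range(n):
            term *= A[p[j]][j]        # entry in row p_j, column j
        total += term
    return total


print(det_by_permutations([[1, 2], [3, 4]]))                   # -2
print(det_by_permutations([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```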

Formula (16) was derived using solely Properties (i), (ii), and (iii) of
determinants. Therefore we conclude with the following theorem.


Theorem 1. Properties (i), (ii), and (iii) uniquely determine the determinant as
a function of $a_1, \ldots, a_n$.

Exercise 4. Show that D defined by (16) has Properties (ii), (iii), and (iv).

Exercise 5. Show that Property (iv) implies Property (i), unless the field K has
characteristic two, that is, 1 + 1 = 0.

Definition. Let A be an $n \times n$ matrix; denote its column vectors by
$a_1, \ldots, a_n$: $A = (a_1, \ldots, a_n)$. Its determinant, denoted as det A, is

$$\det A = D(a_1, \ldots, a_n), \tag{17}$$

where D is defined by formula (16).


The determinant has Properties (i)-(v) that have been derived and verified for the
function D. We state now an additional important property.

Theorem 2. For all pairs of $n \times n$ matrices A and B,

$$\det(BA) = \det A \,\det B. \tag{18}$$

Proof. According to equation (7) of Chapter 4, the jth column of BA is $(BA)e_j$.
The jth column $a_j$ of A is $Ae_j$; therefore the jth column of BA is

$$(BA)e_j = BAe_j = Ba_j.$$

By definition (17),

$$\det(BA) = D(Ba_1, \ldots, Ba_n). \tag{19}$$

We assume now that $\det B \neq 0$ and define the function C as follows:

$$C(a_1, \ldots, a_n) = \frac{\det(BA)}{\det B}. \tag{20}$$


Using (19) we can express C as follows:

$$C(a_1, \ldots, a_n) = \frac{D(Ba_1, \ldots, Ba_n)}{\det B}. \tag{20$'$}$$

We claim that the function C has Properties (i)-(iii) postulated for D.

(i) If $a_i = a_j$, $i \neq j$, then $Ba_i = Ba_j$; since D has Property (i), it follows that the
right-hand side of (20)' is zero. This shows that C also has Property (i).

(ii) Since $Ba_i$ is a linear function of $a_i$, and since D is a multilinear function,
it follows that the right-hand side of (20)' is also a multilinear function. This shows
that C is a multilinear function of $a_1, \ldots, a_n$, that is, has Property (ii).

(iii) Setting $a_i = e_i$, i = 1, 2, ..., n, into formula (20)' we get

$$C(e_1, \ldots, e_n) = \frac{D(Be_1, \ldots, Be_n)}{\det B}. \tag{21}$$

Now $Be_i$ is the ith column $b_i$ of B, so that the right-hand side of (21) is

$$\frac{D(b_1, \ldots, b_n)}{\det B}. \tag{22}$$

By definition (17) applied to B, (22) equals 1; setting this into (21) we see that
$C(e_1, \ldots, e_n) = 1$. This proves that C satisfies Property (iii).

We have shown in Theorem 1 that a function C that satisfies Properties (i)-(iii) is
equal to the function D. So

$$C(a_1, \ldots, a_n) = D(a_1, \ldots, a_n) = \det A.$$

Setting this into (20) proves (18), when $\det B \neq 0$.

When det B = 0 we argue as follows: define the matrix B(t) as

$$B(t) = B + tI.$$

Clearly, B(0) = B. Formula (16) shows that D(B(t)) is a polynomial in t of degree n and
that the coefficient of $t^n$ equals one. Therefore, D(B(t)) is zero for no more than n
values of t; in particular $D(B(t)) \neq 0$ for all t near zero but not equal to zero.
According to what we have already shown, det(B(t)A) = det A det B(t) for all such
values of t; letting t tend to zero yields (18). □

Corollary 3. An $n \times n$ matrix A is invertible iff $\det A \neq 0$.

Proof. Suppose A is not invertible; then its range is a proper subspace of $\mathbb{R}^n$. The
range of A consists of all linear combinations of the columns of A; therefore the
columns are linearly dependent. According to Property (v), this implies that
det A = 0.

Suppose, on the other hand, that A is invertible; denote its inverse by B:

$$BA = I.$$

According to Theorem 2,

$$\det B \,\det A = \det I.$$

By Property (iii), det I = 1, so

$$\det B \,\det A = 1,$$

which shows that $\det A \neq 0$. □

The geometric meaning of the multiplicative property of determinants is this: the
linear mapping B maps every simplex onto another simplex whose volume is |det B|
times the volume of the original simplex. Since every open set is the union of
simplices, it follows that the volume of the image under B of any open set is |det B|
times the original volume.

We turn now to yet another property of determinants. We need the following
lemma.

Lemma 4. Let A be an $n \times n$ matrix whose first column is $e_1$:

$$A = \begin{pmatrix} 1 & \times & \cdots & \times \\ 0 & & & \\ \vdots & & A_{11} & \\ 0 & & & \end{pmatrix}; \tag{23}$$

here $A_{11}$ denotes the $(n - 1) \times (n - 1)$ submatrix formed by entries $a_{ij}$, i > 1, j > 1.
We claim that

$$\det A = \det A_{11}. \tag{24}$$

Proof. As a first step we show that

$$\det A = \det \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & A_{11} & \\ 0 & & & \end{pmatrix}. \tag{25}$$

For it follows from Properties (i) and (ii) that if we alter a matrix by adding a
multiple of one of its columns to another, the altered matrix has the same
determinant as the original. Clearly, by adding suitable multiples of the first column
of A to the others we can turn it into the matrix on the right in (25).

We regard now

$$C(A_{11}) = \det \begin{pmatrix} 1 & 0 \\ 0 & A_{11} \end{pmatrix}$$

as a function of the matrix $A_{11}$. Clearly it has Properties (i)-(iii). Therefore it must
be equal to $\det A_{11}$. Combining this with (25) gives (24). □

Exercise 6. Verify that $C(A_{11})$ has Properties (i)-(iii).

Corollary 5. Let A be a matrix whose jth column is $e_i$. Then

$$\det A = (-1)^{i+j} \det A_{ij}, \tag{25$'$}$$

where $A_{ij}$ is the $(n - 1) \times (n - 1)$ matrix obtained by striking out the ith row and jth
column of A; $A_{ij}$ is called the (ij)th minor of A.

Exercise 7. Deduce Corollary 5 from Lemma 4.

We deduce now the so-called Laplace expansion of a determinant according to its
columns.

Theorem 6. Let A be any $n \times n$ matrix and j any index between 1 and n.
Then

$$\det A = \sum_i (-1)^{i+j} a_{ij} \det A_{ij}. \tag{26}$$

Proof. To simplify notation, we take j = 1. We write $a_1$ as a linear combination
of standard unit vectors:

$$a_1 = a_{11} e_1 + \cdots + a_{n1} e_n.$$

Using multilinearity, we get

$$\begin{aligned} \det A = D(a_1, \ldots, a_n) &= D(a_{11} e_1 + \cdots + a_{n1} e_n, a_2, \ldots, a_n) \\ &= a_{11} D(e_1, a_2, \ldots, a_n) + \cdots + a_{n1} D(e_n, a_2, \ldots, a_n). \end{aligned}$$

Using Corollary 5, we obtain (26).


□ 
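The Laplace expansion (26) lends itself to a recursive computation. The sketch below is only an illustration: the names `minor` and `det_laplace` are ours, indices start at 0 (which leaves the parity of i + j unchanged), and the recursion expands every smaller determinant along its first column.

```python
def minor(A, i, j):
    """The matrix A with row i and column j struck out (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det_laplace(A, j=0):
    """Expand det A along column j: sum_i (-1)^(i+j) * a_ij * det A_ij."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i][j] * det_laplace(minor(A, i, j))
               for i in range(n))


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print([det_laplace(A, j) for j in range(3)])  # [18, 18, 18]
```

That every column gives the same value is the content of Theorem 6, which holds for any index j.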
We show now how determinants can be used to express solutions of systems of
equations of the form

$$Ax = u, \tag{27}$$

A an invertible $n \times n$ matrix. Write

$$x = \sum_j x_j e_j;$$

according to (7) of Chapter 4, $Ae_j = a_j$, the jth column of A. So (27) is equivalent to

$$\sum_j x_j a_j = u. \tag{27$'$}$$

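We consider now the matrix $A_k$ obtained by replacing the kth column of A by u:

$$A_k = (a_1, \ldots, a_{k-1}, u, a_{k+1}, \ldots, a_n) = \Bigl(a_1, \ldots, a_{k-1}, \sum_j x_j a_j, a_{k+1}, \ldots, a_n\Bigr).$$

By multilinearity and Property (i), the only term of $\det A_k$ that survives is the kth, so that
$\det A_k = x_k \det A$; since $\det A \neq 0$,

$$x_k = \frac{\det A_k}{\det A}. \tag{28}$$

Expanding $\det A_k$ by the Laplace expansion (26) according to its kth column, whose entries are
the components $u_i$ of u, gives

$$x_k = \sum_i (-1)^{i+k} \frac{\det A_{ik}}{\det A}\, u_i. \tag{29}$$

This is called Cramer's rule for finding the solution of the system of equations (27).
We now translate (29) into matrix language.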

Theorem 7. The inverse matrix $A^{-1}$ of an invertible matrix A has the form

$$(A^{-1})_{ki} = (-1)^{i+k} \frac{\det A_{ik}}{\det A}. \tag{30}$$

Proof. Since A is invertible, $\det A \neq 0$. $A^{-1}$ acts on the vector u; see formula (1)
of Chapter 4,

$$(A^{-1}u)_k = \sum_i (A^{-1})_{ki}\, u_i. \tag{31}$$

Using (30) in (31) and comparing it to (29) we get that

$$(A^{-1}u)_k = x_k, \qquad k = 1, \ldots, n, \tag{32}$$

that is,

$$A^{-1}u = x.$$

This shows that $A^{-1}$ as defined by (30) is indeed the inverse of A whose action is
given in (27).

We caution the reader that for n > 3, formula (30) is not a practical numerical
method for inverting matrices.
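For small matrices, Cramer's rule and formula (30) can nevertheless be tried out directly. The sketch below is an illustration only, with names (`minor`, `det`, `cramer`, `inverse`) and 0-based indexing chosen by us; it uses exact fractions, and the test system is the four-by-four example solved by Gaussian elimination in Chapter 4, with coefficients as read from that example.

```python
from fractions import Fraction


def minor(A, i, j):
    """The matrix A with row i and column j struck out (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by Laplace expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))


def cramer(A, u):
    """x_k = det A_k / det A, where A_k is A with its kth column replaced by u."""
    d = det(A)
    if d == 0:
        raise ValueError("A is not invertible")
    return [Fraction(det([row[:k] + [u[r]] + row[k + 1:]
                          for r, row in enumerate(A)]), d)
            for k in range(len(A))]


def inverse(A):
    """(A^{-1})_{ki} = (-1)^(i+k) * det A_ik / det A."""
    d = det(A)
    if d == 0:
        raise ValueError("A is not invertible")
    n = len(A)
    return [[Fraction((-1) ** (i + k) * det(minor(A, i, k)), d) for i in range(n)]
            for k in range(n)]


A = [[1, 2, 3, -1], [2, 5, 4, -3], [2, 3, 4, 1], [1, 4, 2, -2]]
u = [-2, 1, 1, 3]
print([int(x) for x in cramer(A, u)])  # [1, 2, -2, 1]
```

Each entry of the inverse costs one determinant of size n - 1, which is why the caution above applies as soon as n is more than a handful.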

Exercise 8. Show that for any square matrix

$$\det A^T = \det A, \qquad A^T = \text{transpose of } A. \tag{33}$$

[Hint: Use formula (16) and show that for any permutation $\sigma(p) = \sigma(p^{-1})$.]


Exercise 9. Given a permutation p of n objects, we define an associated so-
called permutation matrix P as follows:

$$P_{ij} = \begin{cases} 1 & \text{if } j = p(i), \\ 0 & \text{otherwise}. \end{cases} \tag{34}$$

Show that the action of P on any vector x performs the permutation p on the
components of x. Show that if p, q are two permutations and P, Q are the associated
permutation matrices, then the permutation matrix associated with $p \circ q$ is the product
PQ.

The determinant is an important scalar-valued function of $n \times n$ matrices.
Another equally important scalar-valued function is the trace.

Definition. The trace of a square matrix A, denoted as tr A, is the sum of the
entries on its diagonal:

$$\operatorname{tr} A = \sum_i a_{ii}. \tag{35}$$

Theorem 8. (a) Trace is a linear function:

$$\operatorname{tr} kA = k \operatorname{tr} A, \qquad \operatorname{tr}(A + B) = \operatorname{tr} A + \operatorname{tr} B.$$

(b) Trace is "commutative"; that is,

$$\operatorname{tr}(AB) = \operatorname{tr}(BA) \tag{36}$$

for any pair of matrices.

Proof. Linearity is obvious from definition (35). To prove part (b), we use the
rule [see (10)' of Chapter 4] for matrix multiplication:

$$(AB)_{ii} = \sum_k a_{ik} b_{ki}$$

and

$$(BA)_{ii} = \sum_k b_{ik} a_{ki}.$$

So

$$\operatorname{tr}(AB) = \sum_{i,k} a_{ik} b_{ki} = \sum_{i,k} b_{ik} a_{ki} = \operatorname{tr}(BA)$$

follows if one interchanges the names of the indices i, k. □
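The index interchange in this proof can be watched on a concrete pair of matrices. The following sketch is illustrative; the names `matmul` and `trace` and the sample matrices are our own choices.

```python
def matmul(S, T):
    """(ST)_kj = sum_i S_ki * T_ij, for matrices stored as lists of rows."""
    return [[sum(S[k][i] * T[i][j] for i in range(len(T)))
             for j in range(len(T[0]))] for k in range(len(S))]


def trace(A):
    """Sum of the diagonal entries a_ii of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B), matmul(B, A))                # the products differ ...
print(trace(matmul(A, B)), trace(matmul(B, A)))  # ... but both traces are 69
```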

We recall from the end of Chapter 3 the notion of similarity. The matrix A is
called similar to the matrix B if there is an invertible matrix S such that

$$A = SBS^{-1}. \tag{37}$$

We recall from Theorem 8 of Chapter 3 that similarity is an equivalence relation;
that is, it is the following:

(i) Reflexive: A is similar to itself.

(ii) Symmetric: if A is similar to B, B is similar to A.

(iii) Transitive: if A is similar to B, and B is similar to C, then A is similar to C.

Theorem 9. Similar matrices have the same determinant and the same trace.


Proof. Using Theorem 2, we get from (37)

$$\begin{aligned} \det A &= (\det S)(\det B)(\det S^{-1}) = (\det B)(\det S)\det(S^{-1}) \\ &= \det B \,\det(SS^{-1}) = (\det B)(\det I) = \det B. \end{aligned}$$

To show the second part we use Theorem 8(b):

$$\operatorname{tr} A = \operatorname{tr}(SBS^{-1}) = \operatorname{tr}((SB)S^{-1}) = \operatorname{tr}(S^{-1}(SB)) = \operatorname{tr} B. \qquad \Box$$

At the end of Chapter 4 we remarked that any linear map T of an n-dimensional
linear space X into itself can, by choosing a basis in X, be represented as an $n \times n$
matrix. Two different representations, coming from two different choices
of bases, are similar. In view of Theorem 9, we can define the determinant
and trace of such a linear map T as the determinant and trace of a matrix
representing T.

Exercise 10. Let A be an $m \times n$ matrix, B an $n \times m$ matrix. Show that

$$\operatorname{tr} AB = \operatorname{tr} BA.$$

Exercise 11. Let A be an $n \times n$ matrix, $A^T$ its transpose. Show that

$$\operatorname{tr} AA^T = \sum_{i,j} a_{ij}^2.$$

The square root of the double sum on the right is called the Euclidean, or Hilbert-
Schmidt, norm of the matrix A.

In Chapter 9, Theorem 4, we shall derive an interesting connection between 
determinant and trace. 

Exercise 【2. Show that the determinant of the 2x2 matrix 

㈡ ） 

i$ D — ad - he. 

Exercise 13. Show that the determinant of an upper triangular matrix, one
whose elements are zero below the main diagonal, equals the product of its elements
along the diagonal.

Exercise 14. How many multiplications does it take to evaluate det A by using
Gaussian elimination to bring it into upper triangular form?

Exercise 15. How many multiplications does it take to evaluate det A by
formula (16)?

Exercise 16. Show that the determinant of a (3 × 3) matrix

$A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$

can be calculated as follows. Copy the first two columns of A as a fourth and fifth
column:

$\begin{pmatrix} a & b & c & a & b \\ d & e & f & d & e \\ g & h & i & g & h \end{pmatrix}$

$\det A = aei + bfg + cdh - gec - hfa - idb.$

Show that the sum of the products of the three entries along the dexter diagonals,
minus the sum of the products of the three entries along the sinister diagonals, is
equal to the determinant of A.
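The recipe of Exercise 16 is easy to mechanize. The following sketch implements it literally (append two columns, sum the dexter products, subtract the sinister ones) and compares the result with a first-row cofactor expansion. The function names and the test matrix are assumptions made for illustration; the code checks one instance and is not a substitute for the proof the exercise asks for.

```python
def det3_sarrus(M):
    """Determinant of a 3 x 3 matrix (list of rows) by the diagonal rule."""
    # Copy the first two columns as a fourth and fifth column.
    E = [row + row[:2] for row in M]
    dexter = sum(E[0][j] * E[1][j + 1] * E[2][j + 2] for j in range(3))
    sinister = sum(E[2][j] * E[1][j + 1] * E[0][j + 2] for j in range(3))
    return dexter - sinister

def det3_cofactor(M):
    """Expansion along the first row, used only as an independent check."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M = [[2, 1, 0],
     [1, 3, 2],
     [0, 1, 4]]
print(det3_sarrus(M), det3_cofactor(M))
```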



CHAPTER 6 


Spectral Theory 


Spectral theory analyzes linear mappings of a space into itself by decomposing them
into their basic constituents. We start by posing a problem originating in the stability
of periodic motions and show how to solve it using spectral theory.

We assume that the state of the system under study can be described by a finite
number n of parameters; these we lump into a single vector x in $\mathbb{R}^n$. Second, we
assume that the laws governing the evolution in time of the system under study
determine uniquely the state of the system at any future time if the initial state of the
system is given.

Denote by x the state of the system at time t = 0; its state at t = 1 is then
completely determined by x; we denote it as F(x). We assume F to be a differentiable
function. We assume that the laws governing the evolution of the system are the
same at all times; it follows then that if the state of the system at time t = 1 is z, its
state at time t = 2 is F(z). More generally, F relates the state of the system at time t to
its state at t + 1.

Assume that the motion starting at x = 0 is periodic with period one, that is, that it
returns to 0 at time t = 1. That means that

$F(0) = 0.$  (1)

This periodic motion is called stable if, starting at any point h sufficiently close to
zero, the motion tends to zero as t tends to infinity.

The function F describing the motion is differentiable; therefore for small h, F(h)
is accurately described by a linear approximation:

$F(h) \approx Ah.$  (2)

For purposes of this discussion we assume that F is a linear function,

$F(h) = Ah,$  (3)


Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright © 2007 John Wiley & Sons, Inc.

A an n × n matrix. The system starting at h will, after the elapse of N units of time,
be in the position

$A^N h.$  (4)

In the next few pages we investigate such sequences, that is, of the form

$h, Ah, \ldots, A^N h, \ldots.$  (5)

First a few examples of how powers $A^N$ of matrices behave; we choose N = 1024,
because then $A^N$ can be evaluated by performing ten squaring operations:


Case          (a)                                                (b)

A             $\begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$          $\begin{pmatrix} 5 & 6.9 \\ -3 & -4 \end{pmatrix}$

$A^{1024}$    $> 10^{700}$                                       $< 10^{-78}$


These numerical experiments strongly suggest that

(a) $A^N \to \infty$ as $N \to \infty$,

(b) $A^N \to 0$ as $N \to \infty$, that is, each entry of $A^N$ tends to zero.
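The experiment is easy to repeat. The sketch below performs the ten squarings in exact arithmetic (Python integers for case (a), rational numbers for case (b)), which sidesteps floating-point overflow and underflow. The entries of matrix (b), in particular 6.9, are taken as read from the table and should be treated as an assumption; the helper names are hypothetical.

```python
from fractions import Fraction

def matmul2(X, Y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
            for i in range(2)]

def power_by_squaring(A, squarings):
    """Return A**(2**squarings), using exactly `squarings` matrix squarings."""
    P = A
    for _ in range(squarings):
        P = matmul2(P, P)
    return P

A_a = [[3, 2], [1, 4]]
A_b = [[Fraction(5), Fraction(69, 10)], [Fraction(-3), Fraction(-4)]]

P_a = power_by_squaring(A_a, 10)   # A_a ** 1024, exact integers
P_b = power_by_squaring(A_b, 10)   # A_b ** 1024, exact rationals

smallest_a = min(abs(e) for row in P_a for e in row)
largest_b = max(abs(e) for row in P_b for e in row)
print(len(str(smallest_a)), float(largest_b))
```

Exact arithmetic is a deliberate choice here: the entries in case (a) exceed the range of double-precision floats long before N = 1024.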

We turn now to a theoretical analysis of the behavior of sequences of the form (5). 
Suppose that a vector $h \neq 0$ has the special property with respect to the matrix A that
Ah is merely a multiple of h:

$Ah = ah,$ where a is a scalar and $h \neq 0$.  (6)

Then clearly

$A^N h = a^N h.$  (6)$_N$

In this case the behavior of the sequence (5) is as follows:

(i) If $|a| > 1$, $A^N h \to \infty$.

(ii) If $|a| < 1$, $A^N h \to 0$.

(iii) If $a = 1$, $A^N h = h$ for all N.

This simple analysis is applicable only if (6) is satisfied. A vector h satisfying (6)
is called an eigenvector of A; a is called an eigenvalue of A.

How farfetched is it to assume that A has an eigenvector? We shall show that
every n × n matrix over the field of complex numbers has an eigenvector. Choose
any nonzero vector w and build the following set of n + 1 vectors:

$w, Aw, A^2 w, \ldots, A^n w.$

Since n + 1 vectors in the n-dimensional space $\mathbb{C}^n$ are linearly dependent, there is a
nontrivial linear relation between them:

$\sum_{j=0}^{n} c_j A^j w = 0,$

not all $c_j$ zero. We rewrite this relation as

$p(A)w = 0,$  (7)

where p(t) is the polynomial

$p(t) = \sum_{j=0}^{n} c_j t^j.$

Since $w \neq 0$, the polynomial p cannot be a nonzero constant, so it has degree at least one.
Every polynomial over the complex numbers can be written as a product of linear
factors:

$p(t) = c \prod_j (t - a_j), \quad c \neq 0.$

p(A) can be similarly factored and (7) rewritten as

$c \prod_j (A - a_j I) w = 0.$

This shows that the product $\prod_j (A - a_j I)$ maps the nonzero vector w into 0 and is
therefore not invertible. According to Theorem 4 of Chapter 3, a product of
invertible mappings is invertible. It follows that at least one of the matrices $A - a_j I$ is
not invertible; such a matrix has a nontrivial nullspace. Denote by h any nonzero
vector in the nullspace:

$(A - aI)h = 0, \quad a = a_j.$  (6)'

This is our eigenvalue equation (6).

The argument above shows that every matrix A has at least one eigenvalue, but it
does not show how many or how to calculate them. Here is another approach.

Equation (6)' says that h belongs to the nullspace of (A - aI); therefore the
matrix A - aI is not invertible. We saw in Corollary 3 of Chapter 5 that this can
happen if and only if the determinant of the matrix A - aI is zero:

$\det(aI - A) = 0.$  (8)

So equation (8) is necessary for a to be an eigenvalue of A. It is also sufficient; for if
(8) is satisfied, the matrix A - aI is not invertible. By Theorem 1 of Chapter 3 this
noninvertible matrix has a nonzero nullvector h; (6)' shows that h is an eigenvector
of A. When the determinant is expressed by formula (16) of Chapter 5, (8) appears as
an algebraic equation of degree n for a, where A is an n × n matrix. The left-hand
side of (8) is called the characteristic polynomial of the matrix A and is denoted as
$p_A$.




Example 1. $A = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$,

$\det(A - aI) = \det \begin{pmatrix} 3 - a & 2 \\ 1 & 4 - a \end{pmatrix} = (3 - a)(4 - a) - 2 = a^2 - 7a + 10 = 0.$

This equation has two roots,

$a_1 = 2, \quad a_2 = 5.$

These are eigenvalues; there is an eigenvector corresponding to each:

$(A - a_1 I)h_1 = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} h_1 = 0$

is satisfied by

$h_1 = \begin{pmatrix} 2 \\ -1 \end{pmatrix},$

and of course by any scalar multiple of $h_1$. Similarly,

$(A - a_2 I)h_2 = \begin{pmatrix} -2 & 2 \\ 1 & -1 \end{pmatrix} h_2 = 0$

is satisfied by

$h_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$

and of course by any multiple of $h_2$.

The vectors $h_1$ and $h_2$ are not multiples of each other, so they are linearly
independent. Thus any vector h in $\mathbb{R}^2$ can be expressed as a linear combination of $h_1$
and $h_2$:

$h = b_1 h_1 + b_2 h_2.$  (9)

We apply $A^N$ to (9) and use relation (6)$_N$,

$A^N h = b_1 a_1^N h_1 + b_2 a_2^N h_2.$  (9)$_N$

Since $a_1 = 2$, $a_2 = 5$, both $a_1^N = 2^N$ and $a_2^N = 5^N$ tend to infinity; since $h_1$ and $h_2$ are
linearly independent, it follows that also $A^N h$ tends to infinity, unless both $b_1$ and $b_2$
are zero, in which case, by (9), h = 0. Thus we have shown that for $A = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$ and
any $h \neq 0$, $A^N h \to \infty$ as $N \to \infty$; that is, each component tends to infinity. This
bears out our numerical result in case (a). In fact, $A^N \sim 5^N$, also borne out by the
calculations.
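Formula (9)$_N$ of Example 1 can be compared directly with repeated multiplication by A. In the sketch below the eigenvectors are scaled as $h_1 = (2, -1)$ and $h_2 = (1, 1)$, one admissible choice among their scalar multiples; the function names and the sample coefficients are illustrative assumptions.

```python
def matvec2(A, v):
    """Product of a 2 x 2 matrix with a vector of length 2."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

A = [[3, 2], [1, 4]]
h1, a1 = [2, -1], 2     # A h1 = 2 h1
h2, a2 = [1, 1], 5      # A h2 = 5 h2

def power_via_eigenvectors(b1, b2, N):
    """A^N applied to h = b1*h1 + b2*h2, computed from formula (9)_N."""
    return [b1 * a1**N * h1[k] + b2 * a2**N * h2[k] for k in range(2)]

def power_via_iteration(b1, b2, N):
    """The same vector, obtained by applying A to h a total of N times."""
    v = [b1 * h1[k] + b2 * h2[k] for k in range(2)]
    for _ in range(N):
        v = matvec2(A, v)
    return v

print(power_via_eigenvectors(1, 3, 5), power_via_iteration(1, 3, 5))
```

For large N the term with $5^N$ dominates both components, which is the content of the remark $A^N \sim 5^N$.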

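Example 2. Here is a more interesting case. The Fibonacci sequence $f_0, f_1, \ldots$
is defined by the recurrence relation

$f_{n+1} = f_n + f_{n-1},$  (10)

with the starting data $f_0 = 0$, $f_1 = 1$. The first ten terms of the sequence are

0, 1, 1, 2, 3, 5, 8, 13, 21, 34;

they seem to be growing rapidly. We shall construct a formula for $f_n$ that displays its
rate of growth. We start by rewriting the recurrence relation (10) in matrix-vector
form:

$\begin{pmatrix} f_n \\ f_{n+1} \end{pmatrix} = A \begin{pmatrix} f_{n-1} \\ f_n \end{pmatrix}, \quad A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.$  (10)'

We deduce recursively that

$\begin{pmatrix} f_n \\ f_{n+1} \end{pmatrix} = A^n \begin{pmatrix} f_0 \\ f_1 \end{pmatrix}.$  (11)

The characteristic polynomial of A is

$\det(A - aI) = \det \begin{pmatrix} -a & 1 \\ 1 & 1 - a \end{pmatrix} = a^2 - a - 1;$

its zeros are

$a_1 = \frac{1 + \sqrt{5}}{2}, \quad a_2 = \frac{1 - \sqrt{5}}{2}.$

Note that $a_1$ is positive and greater than 1, whereas $a_2$ is negative and in absolute
value much smaller than 1.

The eigenvectors satisfy the equations

$\begin{pmatrix} -a_1 & 1 \\ 1 & 1 - a_1 \end{pmatrix} h_1 = 0, \quad \begin{pmatrix} -a_2 & 1 \\ 1 & 1 - a_2 \end{pmatrix} h_2 = 0.$

These equations are easily solved by looking at the first component:

$h_1 = \begin{pmatrix} 1 \\ a_1 \end{pmatrix}, \quad h_2 = \begin{pmatrix} 1 \\ a_2 \end{pmatrix};$

of course any scalar multiples of them are eigenvectors as well.

Next we express the initial vector $(f_0, f_1)^T = (0, 1)^T$ as a linear combination of
the eigenvectors:

$\begin{pmatrix} 0 \\ 1 \end{pmatrix} = c_1 h_1 + c_2 h_2.$

Comparing the first component shows that $c_2 = -c_1$. The second component yields
$c_1 = 1/\sqrt{5}$. So

$\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{5}} h_1 - \frac{1}{\sqrt{5}} h_2.$

Set this into (11); we get

$\begin{pmatrix} f_n \\ f_{n+1} \end{pmatrix} = \frac{a_1^n}{\sqrt{5}} h_1 - \frac{a_2^n}{\sqrt{5}} h_2.$

The first component of this vector equation is

$f_n = a_1^n/\sqrt{5} - a_2^n/\sqrt{5}.$

Since $a_2^n/\sqrt{5}$ is less than 1/2 in absolute value, and since $f_n$ is an integer, we can put this relation in
the following form:

$f_n = \text{nearest integer to } a_1^n/\sqrt{5}.$

Exercise 1. Calculate $f_{32}$.

We return now to the general case (6), (8). The characteristic polynomial of the
matrix A,

$\det(aI - A) = p_A(a),$

is a polynomial of degree n; the coefficient of the highest power $a^n$ is 1.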




According to the fundamental theorem of algebra, a polynomial of degree n with
complex coefficients has n complex roots; some of the roots may be multiple. The
roots of the characteristic polynomial are the eigenvalues of A. To make sure that
these polynomials have a full set of roots, the spectral theory of linear maps is
formulated in linear spaces over the field of complex numbers.


Theorem 1. Eigenvectors of a matrix A corresponding to distinct eigenvalues
are linearly independent.


Proof. Suppose $a_i \neq a_k$ for $i \neq k$ and

$Ah_i = a_i h_i, \quad h_i \neq 0.$  (12)

Suppose now that there were a nontrivial linear relation among the $h_i$. There may be
several; since all $h_i \neq 0$, all involve at least two eigenvectors. Among them there is
one which involves the least number m of eigenvectors:

$\sum_{j=1}^{m} b_j h_j = 0, \quad b_j \neq 0, \quad j = 1, \ldots, m;$  (13)

here we have renumbered the $h_j$. Apply A to (13) and use (12); we get

$\sum_{j=1}^{m} b_j A h_j = \sum_{j=1}^{m} b_j a_j h_j = 0.$  (13)'

Multiply (13) by $a_m$ and subtract from (13)':

$\sum_{j=1}^{m} (b_j a_j - b_j a_m) h_j = 0.$  (13)''

Clearly the coefficient of $h_m$ is zero and none of the others is zero, so we have a linear
relation among the $h_j$ involving only m - 1 of the vectors, contrary to m being the
smallest number of vectors satisfying such a relation. □

Using Theorem 1 we deduce Theorem 2.

Theorem 2. If the characteristic polynomial of the n × n matrix A has n distinct
roots, then A has n linearly independent eigenvectors.

In this case the n eigenvectors form a basis; therefore every vector h in $\mathbb{C}^n$ can be
expressed as a linear combination of the eigenvectors:

$h = \sum_{j=1}^{n} b_j h_j.$  (14)

Applying $A^N$ to (14) and using (6)$_N$ we get

$A^N h = \sum_{j=1}^{n} b_j a_j^N h_j.$  (14)$_N$

This formula can be used to answer the stability question raised at the beginning of
this chapter:

Exercise 2. (a) Prove that if A has n distinct eigenvalues $a_j$ and all of them are
less than one in absolute value, then for all h in $\mathbb{C}^n$,

$A^N h \to 0$ as $N \to \infty$,

that is, all components of $A^N h$ tend to zero.

(b) Prove that if all $a_j$ are greater than one in absolute value, then for all $h \neq 0$,

$A^N h \to \infty$ as $N \to \infty$,

that is, some components of $A^N h$ tend to infinity.

There are two simple and useful relations between the eigenvalues of A and the 
matrix A itself. 


Theorem 3. Denote by $a_1, \ldots, a_n$ the eigenvalues of A, with the same
multiplicity they have as roots of the characteristic equation of A. Then

$\sum a_i = \operatorname{tr} A, \quad \prod a_i = \det A.$  (15)

Proof. We claim that the characteristic polynomial of A has the form

$p_A(s) = s^n - (\operatorname{tr} A)s^{n-1} + \cdots + (-1)^n \det A.$  (15)'

According to elementary algebra, the polynomial $p_A$ can be factored as

$p_A(s) = \prod_{i=1}^{n} (s - a_i);$  (16)

this shows that the coefficient of $s^{n-1}$ in $p_A$ is $-\sum a_i$, and the constant term is
$(-1)^n \prod a_i$. Comparing this with (15)' gives (15).

To prove (15)', we use first formula (16) in Chapter 5 for the determinant as a sum
of products:

$p_A(s) = \det(sI - A) = \det \begin{pmatrix} s - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & s - a_{22} & & \vdots \\ \vdots & & \ddots & \\ -a_{n1} & \cdots & & s - a_{nn} \end{pmatrix} = \sum_p \sigma(p) \prod_{i=1}^{n} (s\delta_{p_i i} - a_{p_i i}).$

Clearly the terms of degree n and n - 1 in s come from the single product of the
diagonal elements,

$\prod_{i=1}^{n} (s - a_{ii}) = s^n - (\operatorname{tr} A)s^{n-1} + \cdots.$

This identifies the terms of order n and (n - 1) in (15)'. The term of order zero,
$p_A(0)$, is $\det(-A) = (-1)^n \det A$. This proves (15)' and completes the proof of
Theorem 3. □

Exercise 3. Verify for the matrices discussed in Examples 1 and 2,

$\begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix},$

that the sum of the eigenvalues equals the trace, and their product is the determinant
of the matrix.

Relation (6)$_N$, $A^N h = a^N h$, shows that if a is an eigenvalue of A, $a^N$ is an
eigenvalue of $A^N$. Now let q be any polynomial:

$q(s) = \sum_N q_N s^N.$

Multiplying (6)$_N$ by $q_N$ and summing we get

$q(A)h = q(a)h.$  (17)

The following result is called the spectral mapping theorem.

Theorem 4. (a) Let q be any polynomial, A a square matrix, a an eigenvalue of
A. Then q(a) is an eigenvalue of q(A).

(b) Every eigenvalue of q(A) is of the form q(a), where a is an eigenvalue of A.

Proof. Part (a) is merely a verbalization of relation (17), which shows also that A
and q(A) have h as common eigenvector.

To prove (b), let b denote an eigenvalue of q(A); that means that q(A) - bI is not
invertible. Now factor the polynomial q(s) - b:

$q(s) - b = c \prod_i (s - r_i).$

We may set A in place of s:

$q(A) - bI = c \prod_i (A - r_i I).$

By taking b to be an eigenvalue of q(A), the left-hand side is not invertible.
Therefore neither is the right-hand side. Since the right-hand side is a product, it
follows that at least one of the factors $A - r_i I$ is not invertible. That means that some
$r_i$ is an eigenvalue of A. Since $r_i$ is a root of q(s) - b,

$q(r_i) = b.$

This completes the proof of part (b). □
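A small numerical instance of the spectral mapping theorem may help fix ideas. The sketch below uses the matrix of Example 1, whose eigenvalues are 2 and 5, together with a polynomial $q(s) = s^2 - 3s + 1$ chosen arbitrarily for illustration; the helper names are likewise assumptions.

```python
def matmul2(X, Y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
            for i in range(2)]

def matvec2(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def q(s):
    """The scalar polynomial q(s) = s^2 - 3s + 1."""
    return s * s - 3 * s + 1

A = [[3, 2], [1, 4]]
I2 = [[1, 0], [0, 1]]

# The same polynomial evaluated at the matrix A: q(A) = A^2 - 3A + I.
A2 = matmul2(A, A)
qA = [[A2[i][j] - 3 * A[i][j] + I2[i][j] for j in range(2)] for i in range(2)]

# Eigenvalue/eigenvector pairs of A from Example 1.
eigenpairs = [(2, [2, -1]), (5, [1, 1])]
for a, h in eigenpairs:
    print(matvec2(qA, h), [q(a) * c for c in h])   # both sides of (17)
```

Part (b) is visible here as well: the trace and determinant of q(A) equal the sum and product of q(2) and q(5), so q(A) has no eigenvalues beyond these two.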

If in particular we take q to be the characteristic polynomial $p_A$ of A, we conclude
that all eigenvalues of $p_A(A)$ are zero. In fact a little more is true.

Theorem 5 (Cayley-Hamilton). Every matrix A satisfies its own characteristic
equation:

$p_A(A) = 0.$  (18)

Proof. If A has distinct eigenvalues, then according to Theorem 2 it has n linearly
independent eigenvectors $h_j$, $j = 1, \ldots, n$. Using (14) we apply $p_A(A)$:

$p_A(A)h = \sum_j p_A(a_j) b_j h_j = 0$

for all h, proving (18) in this case. For a proof that holds for all matrices we use the
following lemma.


Lemma 6. Let P and Q be two polynomials with matrix coefficients

$P(s) = \sum_j P_j s^j, \quad Q(s) = \sum_k Q_k s^k.$

The product PQ = R is then

$R(s) = \sum_l R_l s^l, \quad R_l = \sum_{j+k=l} P_j Q_k.$

Suppose that the matrix A commutes with the coefficients of Q; then

$P(A)Q(A) = R(A).$  (19)

The proof is self-evident.

We apply Lemma 6 to Q(s) = sI - A and P(s) defined as the matrix of cofactors
of Q(s); that is,

$P_{ij}(s) = (-1)^{i+j} D_{ji}(s),$  (20)

$D_{ij}$ the determinant of the ijth minor of Q(s). According to the formula (30) of
Chapter 5,

$P(s)Q(s) = \det Q(s)\, I = p_A(s)\, I,$  (21)

where $p_A(s)$ is the characteristic polynomial of A. A commutes with the coefficients
of Q; therefore by Lemma 6 we may set s = A in (21). Since Q(A) = 0, it follows
that

$p_A(A) = 0.$

This proves Theorem 5. □
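For 2 × 2 matrices the Cayley-Hamilton theorem can be verified by direct computation, since by (15)' the characteristic polynomial is $s^2 - (\operatorname{tr} A)s + \det A$. The sketch below evaluates this polynomial at A for two sample matrices, one with distinct eigenvalues and one with a double eigenvalue; the function names are assumptions made for illustration.

```python
def matmul2(X, Y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
            for i in range(2)]

def char_poly_at_matrix(A):
    """Evaluate p_A(A) = A^2 - (tr A) A + (det A) I for a 2 x 2 matrix A."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A2 = matmul2(A, A)
    return [[A2[i][j] - tr * A[i][j] + (det if i == j else 0)
             for j in range(2)] for i in range(2)]

print(char_poly_at_matrix([[3, 2], [1, 4]]))     # distinct eigenvalues 2 and 5
print(char_poly_at_matrix([[3, 2], [-2, -1]]))   # double eigenvalue 1
```

The second matrix is not covered by the first half of the proof, which assumes distinct eigenvalues, so it exercises the general statement.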

We are now ready to investigate matrices whose characteristic equation has
multiple roots. First a few examples.

Example 3. A = I,

$p_A(s) = \det(sI - I) = (s - 1)^n;$

1 is an n-fold zero. In this case every nonzero vector h is an eigenvector of A.


Example 4. $A = \begin{pmatrix} 3 & 2 \\ -2 & -1 \end{pmatrix}$, tr A = 2, det A = 1; therefore by Theorem 3,

$p_A(s) = s^2 - 2s + 1,$

whose roots are one, with multiplicity two. The equation

$Ah = \begin{pmatrix} 3h_1 + 2h_2 \\ -2h_1 - h_2 \end{pmatrix} = \begin{pmatrix} h_1 \\ h_2 \end{pmatrix}$

has as solution all vectors h whose components satisfy

$h_1 + h_2 = 0.$

All these are multiples of $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$. So in this case A does not have two independent
eigenvectors.


We claim that if A has only one eigenvalue a and n linearly independent
eigenvectors, then A = aI. For in this case every vector in $\mathbb{C}^n$ can be written as in
(14), a linear combination of eigenvectors. Applying A to (14) and using $a_i = a$ for
i = 1, ..., n gives that

$Ah = ah$

for all h; then A = aI. We further note that every 2 × 2 matrix A with
tr A = 2, det A = 1 has 1 as a double root of its characteristic equation. These
matrices form a two-parameter family; only one member of this family, A = I, has
two linearly independent eigenvectors. This shows that, in general, when the
characteristic equation of A has multiple roots, we cannot expect A to have n linearly
independent eigenvectors.

To make up for this defect one turns to generalized eigenvectors. In the first
instance a generalized eigenvector f is defined as satisfying

$(A - aI)^2 f = 0.$  (22)

We show first that these behave almost as simply under applications of $A^N$ as the
genuine eigenvectors. We set

$(A - aI)f = h.$  (23)

Applying (A - aI) to this and using (22), we get

$(A - aI)h = 0,$  (23)'

that is, h is a genuine eigenvector. We rewrite (23) and (23)' as

$Af = af + h, \quad Ah = ah.$  (24)

Applying A to the first equation of (24) and using the second equation gives

$A^2 f = aAf + Ah = a^2 f + 2ah.$

Repeating this N times gives

$A^N f = a^N f + N a^{N-1} h.$  (25)

Exercise 4. Verify (25) by induction on N.

Exercise 5. Prove that for any polynomial q,

$q(A)f = q(a)f + q'(a)h,$  (26)

where q' is the derivative of q and f satisfies (22).

Formula (25) shows that if $|a| < 1$, and f is a generalized eigenvector of A,
$A^N f \to 0$.
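Formula (25) can be tried out on the matrix of Example 4, whose only eigenvalue is a = 1. The sketch below takes f = (1, 0), an arbitrary choice of generalized eigenvector made for illustration, forms h = (A - aI)f, and compares both sides of (25) for several N. It is a numerical spot check with hypothetical helper names, not the induction asked for in Exercise 4.

```python
def matvec2(A, v):
    """Product of a 2 x 2 matrix with a vector of length 2."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

A = [[3, 2], [-2, -1]]   # the matrix of Example 4
a = 1                    # its double eigenvalue
f = [1, 0]               # satisfies (A - aI)^2 f = 0
h = [Af_k - a * f_k for Af_k, f_k in zip(matvec2(A, f), f)]   # h = (A - aI) f

def lhs(N):
    """A^N f, computed by applying A to f a total of N times."""
    v = f
    for _ in range(N):
        v = matvec2(A, v)
    return v

def rhs(N):
    """a^N f + N a^(N-1) h, the right-hand side of (25); intended for N >= 1."""
    return [a**N * f[k] + N * a**(N - 1) * h[k] for k in range(2)]

for N in range(1, 6):
    print(N, lhs(N), rhs(N))
```

Since a = 1 here, the growth of $A^N f$ is linear in N; the decay statement for $|a| < 1$ is not illustrated by this particular matrix.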

We now generalize the notion of a generalized eigenvector.

Definition. f is a generalized eigenvector of A, with eigenvalue a, if $f \neq 0$ and

$(A - aI)^m f = 0$  (27)

for some positive integer m.

We state now one of the principal results of linear algebra.

Theorem 7 (Spectral Theorem). Let A be an n × n matrix with complex
entries. Every vector in $\mathbb{C}^n$ can be written as a sum of eigenvectors of A, genuine or
generalized.


For the proof, we need the following results of algebra. 


Lemma 8. Let p and q be a pair of polynomials with complex coefficients and
assume that p and q have no common zero. Then there are two other polynomials a
and b such that

$ap + bq = 1.$  (28)

Proof. Denote by $\mathcal{I}$ all polynomials of the form ap + bq. Among them there is
one, nonzero, of lowest degree; call it d. We claim that d divides both p and q; for
suppose not; then the division algorithm yields a remainder r, say

$r = p - md.$

Since p and d belong to $\mathcal{I}$, so does p - md = r; since r has lower degree than d, this
is a contradiction.

We claim that d has degree zero; for if it had degree greater than zero, it would, by
the fundamental theorem of algebra, have a root. Since d divides p and q, this would
be a common root of p and q. Since we have assumed the contrary, deg d = 0
follows; since $d \neq 0$, $d \equiv$ const., say $\equiv 1$. This proves (28). □

Lemma 9. Let p and q be as in Lemma 8, and let A be a square matrix with
complex entries. Denote by $N_p$, $N_q$, and $N_{pq}$ the null spaces of p(A), q(A), and
p(A)q(A), respectively. Then $N_{pq}$ is the direct sum of $N_p$ and $N_q$:

$N_{pq} = N_p \oplus N_q,$  (29)

by which we mean that every x in $N_{pq}$ can be decomposed uniquely as

$x = x_p + x_q, \quad x_p \text{ in } N_p, \quad x_q \text{ in } N_q.$  (29)'


Proof. We replace the argument of the polynomials in (28) by A; we get

$a(A)p(A) + b(A)q(A) = I.$  (30)

Letting both sides act on x we obtain

$a(A)p(A)x + b(A)q(A)x = x.$  (31)

We claim that if x belongs to $N_{pq}$, then the first term on the left in (31) is in $N_q$, and
the second in $N_p$. To see this we use the commutativity of polynomials of the same
matrix:

$q(A)a(A)p(A)x = a(A)p(A)q(A)x = 0,$

since x belongs to the nullspace of p(A)q(A). This proves that the first term on the
left in (31) belongs to the nullspace of q(A); analogously the second term belongs to
the nullspace of p(A). This shows that (31) gives the desired decomposition (29)'.
To show that the decomposition is unique, we argue as follows: If

$x = x_p + x_q = x'_p + x'_q,$

then

$y = x_p - x'_p = x'_q - x_q$

is an element that belongs to both $N_p$ and $N_q$. Let (30) act on y:

$a(A)p(A)y + b(A)q(A)y = y.$

Both terms on the left-hand side are zero; therefore so is the right-hand side, y. This
proves that $x_p = x'_p$, $x_q = x'_q$. □

Corollary 10. Let $p_1, \ldots, p_k$ be a collection of polynomials that are pairwise
without a common zero. Denote the nullspace of the product $p_1(A) \cdots p_k(A)$ by
$N_{p_1 \cdots p_k}$. Then

$N_{p_1 \cdots p_k} = N_{p_1} \oplus \cdots \oplus N_{p_k}.$  (32)

Exercise 6. Prove (32) by induction on k.

Proof of Theorem 7. Let x be any vector; the n + 1 vectors $x, Ax, A^2 x, \ldots, A^n x$
must be linearly dependent; therefore there is a polynomial p of degree less than or
equal to n such that

$p(A)x = 0.$  (33)

We factor p and rewrite this as

$\prod_j (A - r_j I)^{m_j} x = 0,$  (33)'

$r_j$ the roots of p, $m_j$ their multiplicity. When $r_j$ is not an eigenvalue of A, $A - r_j I$ is
invertible; since the factors in (33)' commute, all invertible factors can be removed.
The remaining $r_j$ in (33)' are all eigenvalues of A. Denote

$p_j(s) = (s - r_j)^{m_j};$  (34)

then (33)' can be written as $\prod_j p_j(A)x = 0$, that is, x belongs to $N_{p_1 \cdots p_k}$. Clearly the $p_j$
pairwise have no common zero, so Corollary 10 applies: x can be decomposed as a
sum of vectors in $N_{p_j}$. But by (34) and Definition (27), every $x_j$ in $N_{p_j}$ is a generalized
eigenvector. Thus we have a decomposition of x as a sum of generalized
eigenvectors, as asserted in Theorem 7. □


We have shown earlier in Theorem 5, the Cayley-Hamilton Theorem, that the
characteristic polynomial $p_A$ of A satisfies $p_A(A) = 0$. We denote by $\mathcal{I} = \mathcal{I}_A$ the set
of all polynomials p which satisfy p(A) = 0. Clearly, the sum of two polynomials in
$\mathcal{I}$ belongs to $\mathcal{I}$; furthermore, if p belongs to $\mathcal{I}$, so does every multiple of p. Denote
by $m = m_A$ a nonzero polynomial of smallest degree in $\mathcal{I}$; we claim that all p in $\mathcal{I}$
are multiples of m. Because, if not, then the division process

$p = qm + r$

gives a remainder r of lower degree than m. Clearly, r = p - qm belongs to $\mathcal{I}$,
contrary to the assumption that m is one of lowest degree. Except for a constant
factor, which we fix so that the leading coefficient of $m_A$ is 1, $m = m_A$ is unique. This
polynomial is called the minimal polynomial of A.

To describe precisely the minimal polynomial we return to the definition (27) of a
generalized eigenvector. We denote by $N_m = N_m(a)$ the nullspace of $(A - aI)^m$.

The subspaces $N_m$ consist of generalized eigenvectors; they are indexed
increasingly, that is,

$N_1 \subset N_2 \subset \cdots.$  (35)

Since these are subspaces of a finite-dimensional space, they must be equal from a
certain index on. We denote by d = d(a) the smallest such index, that is,

$N_d = N_{d+1} = \cdots,$  (35)'

but

$N_{d-1} \neq N_d;$  (35)''


d(a) is called the index of the eigenvalue a.

Exercise 7. Show that A maps $N_d$ into itself.

Theorem 11. Let A be an n × n matrix; denote its distinct eigenvalues by
$a_1, \ldots, a_k$, and denote the index of $a_j$ by $d_j$. We claim that the minimal polynomial
$m_A$ is

$m_A(s) = \prod_{j=1}^{k} (s - a_j)^{d_j}.$

Exercise 8. Prove Theorem 11.

Let us denote $N_{d_j}(a_j)$ by $N^{(j)}$; then Theorem 7, the spectral theorem, can be
formulated as follows:

$\mathbb{C}^n = N^{(1)} \oplus N^{(2)} \oplus \cdots \oplus N^{(k)}.$  (36)

The dimension of $N^{(j)}$ equals the multiplicity of $a_j$ as the root of the characteristic
equation of A. Since our proof of this proposition uses calculus, we postpone it until
Theorem 11 of Chapter 9.

A maps each subspace $N^{(j)}$ into itself; such subspaces are called invariant under
A. We turn now to studying the action of A on each subspace; this action is
completely described by the dimensions of $N_1, N_2, \ldots, N_d$ in the following sense.

Theorem 12. (i) Suppose the pair of matrices A and B are similar in the sense
explained in Chapter 5 [see equation (37)],

$A = SBS^{-1},$  (37)

S some invertible matrix. Then A and B have the same eigenvalues:

$a_1 = b_1, \ldots, a_k = b_k;$  (38)

furthermore, the nullspaces

$N_m(a_j)$ = nullspace of $(A - a_j I)^m$

and

$M_m(a_j)$ = nullspace of $(B - a_j I)^m$

have for all j and m the same dimensions:

$\dim N_m(a_j) = \dim M_m(a_j).$  (39)

(ii) Conversely, if A and B have the same eigenvalues, and if condition (39) about
the nullspaces having the same dimension is satisfied, then A and B are similar.

Proof. Part (i) is obvious; for if A and B are similar, so are A - aI and B - aI,
and so is any power of them:

$(A - aI)^m = S(B - aI)^m S^{-1}.$  (40)

Since S is a 1-to-1 mapping, the nullspaces of two similar matrices have the same
dimension. Relations (39), and in particular (38), follow from this observation.

The converse proposition will be proved in Appendix 15.

Theorems 4, 7, and 12 are the basic facts of the spectral theory of matrices. We
wish to point out that the concepts that enter these theorems (eigenvalue,
eigenvector, generalized eigenvector, index) remain meaningful for any mapping
A of any finite-dimensional linear space X over $\mathbb{C}$ into itself. The three theorems
remain true in this abstract context and so do the proofs.

The usefulness of spectral theory in an abstract setting is shown in the following
important generalization of Theorem 7.

Theorem 14. Denote by X a finite-dimensional linear space over the complex
numbers, by A and B linear maps of X into itself, which commute:

$AB = BA.$  (41)

Then there is a basis in X which consists of eigenvectors and generalized
eigenvectors of both A and B.


Proof. According to the Spectral Theorem, Theorem 7, equation (36), X can be
decomposed as a direct sum of generalized eigenspaces of A:

$X = N^{(1)} \oplus \cdots \oplus N^{(k)},$

$N^{(j)}$ the nullspace of $(A - a_j I)^{d_j}$. We claim that B maps $N^{(j)}$ into $N^{(j)}$; for B is
assumed to commute with A, and therefore commutes with $(A - aI)^d$:

$B(A - aI)^d x = (A - aI)^d Bx.$  (42)

If $a = a_j$ is an eigenvalue with index $d = d_j$ and x belongs to $N^{(j)}$, the left-hand side of (42) is 0; therefore so
is the right-hand side, which proves that Bx is in $N^{(j)}$. Now we apply the Spectral
Theorem to the linear mapping B acting on $N^{(j)}$, and obtain a spectral decomposition
of each $N^{(j)}$ with respect to B. This proves Theorem 14. □

Corollary 15. Theorem 14 remains true if A, B are replaced by any number of
pairwise commuting linear maps.

Exercise 9. Prove Corollary 15.

In Chapter 3 we defined the transpose of a linear map. When A is a matrix, that
is, a map $\mathbb{C}^n \to \mathbb{C}^n$, its transpose $A^T$ is obtained by interchanging the rows and
columns of A.

Theorem 16. Every square matrix A is similar to its transpose $A^T$.

Proof. We have shown in Chapter 3, Theorem 6, that a mapping A of a space X
into itself, and its transpose A' mapping X' into itself, have nullspaces of the same
dimension. Since the transpose of A - aI is A' - aI', it follows that A and A' have the
same eigenvalues, and that their eigenspaces have the same dimension.

The transpose of $(A - aI)^j$ is $(A' - aI')^j$; therefore their nullspaces have the same
dimension. We can now appeal to Theorem 12 and conclude that A and A',
interpreted as matrices, are similar. □

Theorem 17. Let X be a finite-dimensional linear space over $\mathbb{C}$, A a linear
mapping of X into X. Denote by X' the dual of X, $A': X' \to X'$ the transpose of A. Let
a and b denote two distinct eigenvalues of A: $a \neq b$, x an eigenvector of A with
eigenvalue a, l an eigenvector of A' with eigenvalue b. Then l and x annihilate each
other:

$(l, x) = 0.$  (43)

Proof. The transpose of A is defined in equation (9) of Chapter 3 by requiring
that for every x in X and every l in X'

$(A'l, x) = (l, Ax).$

If in particular we take x to be an eigenvector of A and l to be an eigenvector of A',

$Ax = ax, \quad A'l = bl,$

and we deduce that

$b(l, x) = a(l, x).$

Since we have taken $a \neq b$, (l, x) must be zero. □

Theorem 17 is useful in calculating and studying the properties of expansions of
vectors x in terms of eigenvectors.

Theorem 18. Suppose the mapping A has n distinct eigenvalues $a_1, \ldots, a_n$.
Denote the corresponding eigenvectors of A by $x_1, \ldots, x_n$, those of A' by $l_1, \ldots, l_n$.
Then

(a) $(l_i, x_i) \neq 0$, $i = 1, \ldots, n$.

(b) Let

$x = \sum_j k_j x_j$  (44)

be the expansion of x as a sum of eigenvectors; then

$k_i = (l_i, x)/(l_i, x_i), \quad i = 1, \ldots, n.$  (45)

Exercise 10. Prove Theorem 18.
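Formulas (44) and (45) can be illustrated on the matrix of Example 1. The sketch below hard-codes eigenvectors of $A = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$ and of its transpose (eigenvalues 2 and 5); their scaling, the sample vector, and the function names are assumptions made for illustration. It checks that eigenvectors belonging to different eigenvalues annihilate each other, and that the coefficients from (45) reproduce the vector.

```python
from fractions import Fraction

def pair(l, x):
    """The pairing (l, x) of a row vector l with a column vector x."""
    return sum(li * xi for li, xi in zip(l, x))

# Eigenvectors of A = [[3, 2], [1, 4]] and of its transpose, in the order 2, 5.
x_vecs = [[2, -1], [1, 1]]
l_vecs = [[1, -1], [1, 2]]

def expansion_coefficients(x):
    """Coefficients k_i of x in the basis x_vecs, computed by formula (45)."""
    return [Fraction(pair(l, x), pair(l, xi)) for l, xi in zip(l_vecs, x_vecs)]

x = [7, 1]
k = expansion_coefficients(x)
rebuilt = [sum(k[j] * x_vecs[j][c] for j in range(2)) for c in range(2)]
print(k, rebuilt)
```

The point of (45) is that no linear system has to be solved: each coefficient is a ratio of two pairings.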


Exercise 11. Take the matrix

$\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$

from equation (10)' of Example 2.

(a) Determine the eigenvectors of its transpose.

(b) Use formulas (44) and (45) to determine the expansion of the vector $(0, 1)^T$
in terms of the eigenvectors of the original matrix. Show that your answer
agrees with the expansion obtained in Example 2.





has 1 as an eigenvalue. What are the other two eigenvalues?



CHAPTER 7 


Euclidean Structure 


In this chapter we abstract the concept of Euclidean distance. We gain no greater
generality; we gain simplicity, transparency and flexibility.

We review the basic structure of Euclidean spaces. We choose a point 0 as origin
in real n-dimensional Euclidean space; the length of any vector x in space, denoted
as $\|x\|$, is defined as its distance to the origin.

Let us introduce a Cartesian coordinate system and denote the Cartesian
coordinates of x as $x_1, \ldots, x_n$. By repeated use of the Pythagorean theorem we can
express the length of x in terms of its Cartesian coordinates:

$\|x\| = \sqrt{x_1^2 + \cdots + x_n^2}.$  (1)

The scalar product of two vectors x and y, denoted as (x, y), is defined by

$(x, y) = \sum_j x_j y_j.$  (2)

Clearly, the two concepts are related; we can express the length of a vector as

$\|x\|^2 = (x, x).$  (2)'

The scalar product is commutative:

$(x, y) = (y, x)$  (3)

and bilinear:

$(x + u, y) = (x, y) + (u, y),$

$(x, y + v) = (x, y) + (x, v).$  (3)'


Using these algebraic properties of scalar product we can derive the identity

$(x - y, x - y) = (x, x) - 2(x, y) + (y, y).$

Using (2)', we can rewrite this identity as

$\|x - y\|^2 = \|x\|^2 - 2(x, y) + \|y\|^2.$  (4)

The term on the left is the distance of x from y, squared; the first and third terms
on the right are the distances of x and y from 0, squared. These three quantities
have geometric meaning; therefore they have the same value in any Cartesian
coordinate system. It follows therefore from (4) that also the scalar product (2) has
the same value in all Cartesian coordinate systems. By choosing special coordinate
axes, the first one through x, the second so that y is contained in the plane spanned by
the first two axes, we can uncover the geometric meaning of (x, y).


[Figure: the vectors x and y in the plane spanned by the first two coordinate axes.]

The coordinates of the vector x and y in this coordinate system are
$x = (\|x\|, 0, \ldots, 0)$ and $y = (\|y\| \cos\theta, \ldots)$. Therefore

$(x, y) = \|x\| \|y\| \cos\theta,$  (5)

$\theta$ the angle between x and y.

The three points 0, x, y form a triangle whose sides are $a = \|x\|$, $b = \|y\|$,
$c = \|x - y\|$, forming an angle $\theta$ at 0:

[Figure: the triangle with vertices 0, x, y and angle $\theta$ at 0.]

Relations (4) and (5) can be written as

$c^2 = a^2 + b^2 - 2ab \cos\theta.$  (4)'

This is the classical law of cosine; a special case of it, $\theta = \pi/2$, is the Pythagorean
theorem.

Most texts derive formula (5) for the scalar product from the law of cosine. This is
a pedagogical blunder, for most students have long forgotten the law of cosine, if
they ever knew it.

We shall give now an abstract, that is axiomatic, definition of Euclidean space.

Definition. A Euclidean structure in a linear space X over the reals is furnished
by a real-valued function of two vector arguments called a scalar product and
denoted as (x, y), which has the following properties:

(i) (x, y) is a bilinear function; that is, it is a linear function of each argument
when the other is kept fixed.

(ii) It is symmetric:

$(x, y) = (y, x).$  (6)

(iii) It is positive:

$(x, x) > 0$ except for x = 0.  (7)

Note that the scalar product (2) satisfies these axioms. We shall show now that,
conversely, all of Euclidean geometry is contained in these simple axioms.

We define the Euclidean length (also called norm) of x by

$\|x\| = (x, x)^{1/2}.$  (8)

A scalar product is also called an inner product, or a dot product.

Definition. The distance of two vectors x and y in a linear space with Euclidean
norm is defined as $\|x - y\|$.

Theorem 1 (Schwarz Inequality). For all $x$, $y$,

$$ |(x, y)| \le \|x\| \, \|y\|. \qquad (9) $$

Proof. Consider the function $q(t)$ of the real variable $t$ defined by

$$ q(t) = \|x + ty\|^2. \qquad (10) $$

Using the definition (8) and properties (i) and (ii) we can write

$$ q(t) = \|x\|^2 + 2t(x, y) + t^2 \|y\|^2. \qquad (10)' $$

Assume that $y \ne 0$ and set $t = -(x, y)/\|y\|^2$ in (10)'. Since (10) shows that
$q(t) \ge 0$ for all $t$, we get that

$$ \|x\|^2 - \frac{(x, y)^2}{\|y\|^2} \ge 0. $$

This proves (9). For $y = 0$, (9) is trivially true. □


Note that for the concrete scalar product (2), inequality (9) follows from the
representation (5) of $(x, y)$ as $\|x\| \, \|y\| \cos\theta$.


Theorem 2.

$$ \|x\| = \max\, (x, y), \quad \|y\| = 1. \qquad (11) $$

Exercise 1. Prove Theorem 2.

Theorem 3 (Triangle Inequality). For all $x$, $y$,

$$ \|x + y\| \le \|x\| + \|y\|. \qquad (12) $$

Proof. Using the algebraic properties of scalar product, we derive, analogously to
(4), the identity

$$ \|x + y\|^2 = \|x\|^2 + 2(x, y) + \|y\|^2 \qquad (12)' $$

and estimate the middle term by the Schwarz inequality; this gives
$\|x + y\|^2 \le (\|x\| + \|y\|)^2$. □

Motivated by (5) we make the following definitions.

Definition. Two vectors $x$ and $y$ are called orthogonal (perpendicular), denoted
as $x \perp y$, if

$$ (x, y) = 0. \qquad (13) $$

From (12)' we deduce the Pythagorean theorem

$$ \|x + y\|^2 = \|x\|^2 + \|y\|^2 \quad \text{if } x \perp y. \qquad (13)' $$

Definition. Let $X$ be a finite-dimensional linear space with a Euclidean structure,
$x^{(1)}, \ldots, x^{(n)}$ a basis for $X$. This basis is called orthonormal with respect to a given
Euclidean structure if

$$ (x^{(j)}, x^{(k)}) = \begin{cases} 0 & \text{for } j \ne k, \\ 1 & \text{for } j = k. \end{cases} \qquad (14) $$

Theorem 4 (Gram-Schmidt). Given an arbitrary basis $y^{(1)}, \ldots, y^{(n)}$ in a
finite-dimensional linear space equipped with a Euclidean structure, there is a related basis
$x^{(1)}, \ldots, x^{(n)}$ with the following properties:

(i) $x^{(1)}, \ldots, x^{(n)}$ is an orthonormal basis.

(ii) $x^{(k)}$ is a linear combination of $y^{(1)}, \ldots, y^{(k)}$, for all $k$.

Proof. We proceed recursively; suppose $x^{(1)}, \ldots, x^{(k-1)}$ have already been
constructed. We set

$$ x^{(k)} = c \Bigl( y^{(k)} - \sum_{l=1}^{k-1} c_l x^{(l)} \Bigr). $$

Since $x^{(1)}, \ldots, x^{(k-1)}$ are already orthonormal, it is easy to see that $x^{(k)}$ defined above
is orthogonal to them if we choose

$$ c_l = (y^{(k)}, x^{(l)}), \quad l = 1, \ldots, k - 1. $$

Finally we choose $c$ so that $\|x^{(k)}\| = 1$. □
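The recursion in this proof can be tried out numerically. The following is a minimal sketch, assuming the concrete scalar product of $\mathbb{R}^n$ and vectors stored as Python lists; the function names `dot` and `gram_schmidt` and the sample basis are illustrative choices, not part of the text.

```python
import math

def dot(u, v):
    """Standard scalar product of two real vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Orthonormalize a list of linearly independent vectors.

    Mirrors the recursion of the proof: subtract from y^(k) its components
    c_l = (y^(k), x^(l)) along the earlier x^(l), then rescale to length 1.
    """
    ortho = []
    for y in basis:
        coeffs = [dot(y, x) for x in ortho]       # c_l = (y^(k), x^(l))
        r = list(y)
        for c, x in zip(coeffs, ortho):
            r = [ri - c * xi for ri, xi in zip(r, x)]
        norm = math.sqrt(dot(r, r))
        if norm < 1e-12:
            raise ValueError("input vectors are not linearly independent")
        ortho.append([ri / norm for ri in r])     # the factor c is 1 / norm
    return ortho

# Illustrative basis of R^3 (an assumption, chosen only for the demonstration).
ys = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
xs = gram_schmidt(ys)

# Matrix of scalar products (x^(j), x^(k)); by (14) it should be the identity.
gram = [[dot(u, v) for v in xs] for u in xs]
```

Because each new vector is built only from $y^{(k)}$ and the earlier $x^{(l)}$, property (ii) of the theorem holds by construction; the matrix `gram` checks property (i) up to rounding error.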


Theorem 4 guarantees the existence of plenty of orthonormal bases. Given such a
basis, any $x$ can be written as

$$ x = \sum_{j=1}^{n} a_j x^{(j)}. \qquad (15) $$

Take the scalar product of (15) with $x^{(l)}$; using the orthonormality relations (14) we
get

$$ (x, x^{(l)}) = a_l. \qquad (16) $$

Let $y$ be any other vector in $X$; it can be expressed as

$$ y = \sum_{k=1}^{n} b_k x^{(k)}. $$

Take the scalar product of $y$ with $x$, using the expression (15). Then, using (14), we
get

$$ (x, y) = \sum_{j,k} a_j b_k (x^{(j)}, x^{(k)}) = \sum_{j=1}^{n} a_j b_j. \qquad (17) $$

In particular, for $y = x$ we get

$$ \|x\|^2 = \sum_{j=1}^{n} a_j^2. \qquad (17)' $$

Equation (17) shows that the mapping defined by (16),

$$ x \to (a_1, \ldots, a_n), $$

carries the space $X$ with a Euclidean structure into $\mathbb{R}^n$ and carries the scalar product
of $X$ into the standard scalar product (2) of $\mathbb{R}^n$.

Since the scalar product is bilinear, for $y$ fixed $(x, y)$ is a linear function of $x$.
Conversely, we have the following theorem.

Theorem 5. Every linear function $l(x)$ on a finite-dimensional linear space $X$
with Euclidean structure can be written in the form

$$ l(x) = (x, y), \qquad (18) $$

$y$ some element of $X$.

Proof. Introduce an orthonormal basis $x^{(1)}, \ldots, x^{(n)}$ in $X$; denote the value of $l$ on
$x^{(k)}$ by

$$ l(x^{(k)}) = b_k. $$

Set

$$ y = \sum_{k=1}^{n} b_k x^{(k)}. \qquad (19) $$

It follows from orthonormality that $(x^{(k)}, y) = b_k$. This shows that (18) holds for
$x = x^{(k)}$, $k = 1, 2, \ldots, n$; but if two linear functions have the same value for all
vectors that form a basis, they have the same value for all vectors. □

Corollary 5'. The mapping $l \to y$ is an isomorphism of the Euclidean space $X$
with its dual.

Definition. Let $X$ be a finite-dimensional linear space with Euclidean structure,
$Y$ a subspace of $X$. The orthogonal complement of $Y$, denoted as $Y^\perp$, consists of all
vectors $z$ in $X$ that are orthogonal to every $y$ in $Y$:

$$ z \text{ in } Y^\perp \quad \text{if } (y, z) = 0 \text{ for all } y \text{ in } Y. $$

Recall that in Chapter 2 we denoted by $Y^\perp$ the set of linear functionals that vanish
on $Y$. The notation $Y^\perp$ introduced above is consistent with the previous notation when
the dual of $X$ is identified with $X$ via (18). In particular, $Y^\perp$ is a subspace of $X$.

Theorem 6. For any subspace $Y$ of $X$,

$$ X = Y \oplus Y^\perp. \qquad (20) $$

The meaning of (20) is that every $x$ in $X$ can be decomposed uniquely as

$$ x = y + y^\perp, \quad y \text{ in } Y,\ y^\perp \text{ orthogonal to } Y. \qquad (20)' $$

Proof. We show first that a decomposition of form (20)' is unique. Suppose we
could write

$$ x = z + z^\perp, \quad z \text{ in } Y,\ z^\perp \text{ in } Y^\perp. $$

Comparing this with (20)' gives

$$ y - z = z^\perp - y^\perp. $$

It follows from this that $y - z$ belongs both to $Y$ and to $Y^\perp$, and thus is orthogonal to
itself:

$$ 0 = (y - z, z^\perp - y^\perp) = (y - z, y - z) = \|y - z\|^2, $$

but by positivity of norm, $y - z = 0$.

To prove that a decomposition of form (20)' is always possible, we construct an
orthonormal basis of $X$ whose first $k$ members lie in $Y$; the rest must lie in $Y^\perp$. We can
construct such a basis by starting with an orthonormal basis in $Y$, then complete it to a
basis in $X$, and then orthonormalize the rest of the basis by the procedure described
in Theorem 4. Then $x$ can be decomposed as in (15). We break this decomposition
into two parts:

$$ x = \sum_{j=1}^{n} a_j x^{(j)} = \sum_{j=1}^{k} a_j x^{(j)} + \sum_{j=k+1}^{n} a_j x^{(j)} = y + y^\perp; \qquad (21) $$

clearly, $y$ lies in $Y$ and $y^\perp$ in $Y^\perp$. □

In the decomposition (20)' the component $y$ is called the orthogonal projection of
$x$ into $Y$, denoted by

$$ y = P_Y x. \qquad (22) $$

Theorem 7. (i) The mapping $P_Y$ is linear.

(ii) $P_Y^2 = P_Y$.

Proof. Let $w$ be any vector in $X$, unrelated to $x$, and let its decomposition (20)' be

$$ w = z + z^\perp, \quad z \text{ in } Y,\ z^\perp \text{ in } Y^\perp. $$

Adding this to (20)' gives

$$ x + w = (y + z) + (y^\perp + z^\perp), $$

the decomposition of $x + w$. This shows that $P_Y(x + w) = P_Y x + P_Y w$. Similarly,
$P_Y(kx) = k P_Y x$.

To show that $P_Y^2 = P_Y$, we take any $x$ and decompose it as in (20)': $x = y + y^\perp$.
The vector $y = P_Y x$ needs no further decomposition: $P_Y y = y$. □

Theorem 8. Let $Y$ be a linear subspace of the Euclidean space $X$, $x$ some vector
in $X$. Then among all elements $z$ of $Y$, the one closest in Euclidean distance to $x$ is
$P_Y x$.

Proof. Using the decomposition (20)' of $x$ we have

$$ x - z = y - z + y^\perp, \quad y^\perp \text{ in } Y^\perp. $$

Since $y$ and $z$ both belong to $Y$, so does $y - z$.

Therefore by the Pythagorean theorem (13)',

$$ \|x - z\|^2 = \|y - z\|^2 + \|y^\perp\|^2; $$

clearly this is smallest when $z = y$. Since the distance between two vectors $x$, $z$ is
$\|x - z\|$, this proves Theorem 8. □
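The decomposition (20)' and the closest-point property of Theorem 8 can be illustrated with a small computation. The sketch below assumes $X = \mathbb{R}^3$ with the standard scalar product and a plane $Y$ given by an orthonormal basis; the helper names `project` and `dist` and the sample vectors are hypothetical, introduced only for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(x, onb):
    """Orthogonal projection of x onto the span of the orthonormal list onb.

    Uses (15)-(16): the component of x along each basis vector e is (x, e).
    """
    p = [0.0] * len(x)
    for e in onb:
        a = dot(x, e)
        p = [pi + a * ei for pi, ei in zip(p, e)]
    return p

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

s = 1.0 / math.sqrt(2.0)
onb = [[s, s, 0.0], [0.0, 0.0, 1.0]]     # orthonormal basis of a plane Y in R^3
x = [3.0, 1.0, 2.0]

y = project(x, onb)                       # y = P_Y x, approximately (2, 2, 2)
y_perp = [a - b for a, b in zip(x, y)]    # approximately (1, -1, 0), orthogonal to Y

# A few other points of Y to compare distances against.
others = [project(v, onb) for v in ([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, -2.0, 1.0])]
```

Projecting `y` again returns `y`, which is part (ii) of Theorem 7, and `dist(x, y)` is no larger than `dist(x, z)` for each `z` in `others`, as Theorem 8 asserts.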

We turn now to linear mappings of a Euclidean space $X$ into another Euclidean
space $U$. Since a Euclidean space can be identified in a natural way with its own
dual, the transpose of a linear map $A$ of such a space $X$ into $U$ maps $U$ into $X$. To
indicate this distinction, and for yet another reason explained at the end of this
chapter, the transpose of a map $A$ of Euclidean $X$ into $U$ is called the adjoint of $A$ and
is denoted by $A^*$.

Here is the full definition of the adjoint of a linear mapping $A$ of a Euclidean
space $X$ into another Euclidean space $U$:

Given any $u$ in $U$,

$$ l(x) = (Ax, u) $$

is a linear function of $x$. According to Theorem 5, this linear function $l(x)$ can be
represented as $(x, y)$, $y$ in $X$. Therefore for all $x$ in $X$

$$ (x, y) = (Ax, u). \qquad (23) $$

The vector $y$ depends on $u$; since scalar products are bilinear, $y$ depends linearly on
$u$; we denote this dependence as $y = A^* u$, and rewrite (23) as

$$ (x, A^* u) = (Ax, u). \qquad (23)' $$

Note that $A^*$ maps $U$ into $X$; the parentheses on the left denote the scalar product in
$X$, while those on the right denote the scalar product in $U$.

The next theorem lists the basic properties of adjointness:

Theorem 9. (i) If $A$ and $B$ are linear mappings of $X$ into $U$, then

$$ (A + B)^* = A^* + B^*. $$

(ii) If $A$ is a linear map of $X$ into $U$, while $C$ is a linear map of $U$ into $V$, then

$$ (CA)^* = A^* C^*. $$

(iii) If $A$ is a 1-to-1 mapping of $X$ onto $U$, then

$$ (A^{-1})^* = (A^*)^{-1}. $$

(iv) $(A^*)^* = A$.

Proof. (i) is an immediate consequence of (23)'; (ii) can be demonstrated in two
steps:

$$ (CAx, v) = (Ax, C^* v) = (x, A^* C^* v). $$

(iii) follows from (ii) applied to $A^{-1}A = I$, $I$ the identity mapping, and the
observation that $I^* = I$. (iv) follows if we use the symmetry of the scalar product to
rewrite (23)' as

$$ (u, Ax) = (A^* u, x). $$ □

When we take $X$ to be $\mathbb{R}^n$ and $U$ to be $\mathbb{R}^m$ with their standard Euclidean structures,
and interpret $A$ and $A^*$ as matrices, they are transposes of each other.
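As a quick numerical illustration of the last remark, the defining identity (23)' can be checked for a concrete matrix and its transpose. This is only a sketch with an arbitrarily chosen $3 \times 2$ matrix and test vectors; none of the names or numbers come from the text.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [dot(row, x) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1.0, 2.0],
     [0.0, -1.0],
     [3.0, 4.0]]            # maps X = R^2 into U = R^3
x = [2.0, -1.0]             # a vector in X
u = [1.0, 5.0, -2.0]        # a vector in U

lhs = dot(x, matvec(transpose(A), u))   # (x, A* u), scalar product in R^2
rhs = dot(matvec(A, x), u)              # (Ax, u),  scalar product in R^3
```

Both sides evaluate to the same number, although the two scalar products are taken in spaces of different dimension.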

We present now an important application of the notion of the adjoint.

There are many situations where quantities $x_1, \ldots, x_n$ cannot be measured
directly, but certain linear combinations of them,

$$ a_1 x_1 + \cdots + a_n x_n, $$

can. Suppose that $m$ such linear combinations have been measured. We can put all
this information in the form of a matrix equation

$$ Ax = p, \qquad (24) $$

where $p_1, \ldots, p_m$ are the measured values, and $A$ is an $m \times n$ matrix. We shall
examine the case where the number $m$ of measurements exceeds the number $n$ of
quantities whose value is of interest to us. Such a system of equations is
overdetermined and in general does not have a solution. This is not as alarming as it
sounds, because no measurement is perfect, and therefore none of the equations is
expected to hold exactly. In such a situation, we seek that vector $x$ that comes closest
to satisfying all the equations in the sense that makes $\|Ax - p\|^2$ as small as
possible.

For such an $x$ to be determined uniquely, $A$ cannot have nonzero null vectors. For
if $Ay = 0$, and $x$ is a minimizer of $\|Ax - p\|$, then so is $x + ky$, $k$ any number.

Theorem 10. Let $A$ be an $m \times n$ matrix, $m > n$, and suppose that $A$ has only the
trivial null vector $0$. The vector $x$ that minimizes $\|Ax - p\|^2$ is the solution $z$ of

$$ A^* A z = A^* p. \qquad (25) $$

Proof. We show first that equation (25) has a unique solution. Since the range of
$A^*$ is $\mathbb{R}^n$, (25) is a system of $n$ equations for $n$ unknowns. According to Corollary B
in Chapter 3, a unique solution is guaranteed if the homogeneous equation

$$ A^* A y = 0 \qquad (25)' $$

has only the trivial solution $y = 0$. To see that this is the case, take the scalar product
of (25)' with $y$. We get, using the definition (23)' of adjointness,
$0 = (A^* A y, y) = (Ay, Ay) = \|Ay\|^2$. Since the norm is positive, it follows that $Ay = 0$. Since we have
assumed that $A$ has only the trivial nullspace, $y = 0$ follows.

$A$ maps $\mathbb{R}^n$ into an $n$-dimensional subspace of $\mathbb{R}^m$. Suppose $z$ is a vector in
$\mathbb{R}^n$ with the following property:

$Az - p$ is orthogonal to the range of $A$. We claim that such a $z$ minimizes
$\|Ax - p\|^2$. To see this let $x$ be any vector in $\mathbb{R}^n$; split it as $x = z + y$; then

$$ Ax - p = A(z + y) - p = Az - p + Ay. $$

By hypothesis $Az - p$ and $Ay$ are orthogonal; therefore by the Pythagorean
theorem,

$$ \|Ax - p\|^2 = \|Az - p\|^2 + \|Ay\|^2; $$

this demonstrates the minimizing property of $z$.

To find $z$, we write the condition imposed on $z$ in the form

$$ (Az - p, Ay) = 0 \quad \text{for all } y. $$

Using the adjoint of $A$ we can rewrite this as

$$ (A^*(Az - p), y) = 0 \quad \text{for all } y. $$

The range of $A^*$ is $\mathbb{R}^n$, so for this condition to hold for all $y$, $A^*(Az - p)$ must be $0$,
which is equation (25) for $z$. □
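A small worked instance of equation (25) may help. The sketch below fits a straight line to four hypothetical measurements (so $m = 4$, $n = 2$) by forming $A^*A$ and $A^*p$ and solving the resulting $2 \times 2$ system with Cramer's rule; the data and all function names are assumptions made for the illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve_2x2(M, b):
    """Solve the 2x2 system M z = b by Cramer's rule (M must be invertible)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

def residual_sq(A, x, p):
    """The quantity ||Ax - p||^2 that is being minimized."""
    r = [a - b for a, b in zip(matvec(A, x), p)]
    return dot(r, r)

# Hypothetical data: model p ~ z0 + z1 * t, measured at t = 0, 1, 2, 3.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
p = [1.0, 3.0, 4.0, 8.0]

At = transpose(A)
AtA = [[dot(ci, cj) for cj in At] for ci in At]   # A*A: scalar products of columns of A
Atp = matvec(At, p)                               # A*p
z = solve_2x2(AtA, Atp)                           # solution of (25)

r = [a - b for a, b in zip(matvec(A, z), p)]      # Az - p
orth = matvec(At, r)                              # A*(Az - p), should vanish
```

For these data the solution is approximately $z = (0.7, 2.2)$, and `orth` vanishes up to rounding, which is the orthogonality of $Az - p$ to the range of $A$ used in the proof. Moving `z` in any direction increases `residual_sq`.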

Theorem 11. An orthogonal projection $P_Y$ defined in equation (22) is its own
adjoint,

$$ P_Y^* = P_Y. $$

Exercise 2. Prove Theorem 11.

We turn now to the following question: what mappings $M$ of a Euclidean space
into itself preserve the distance of any pair of points, that is, satisfy for all $x$, $y$,

$$ \|M(x) - M(y)\| = \|x - y\|? \qquad (26) $$

Such a mapping is called an isometry. It is obvious from the definition that the
composite of two isometries is an isometry. An elementary example of an isometry is
translation:

$$ M(x) = x + a, $$

$a$ some fixed vector. Given any isometry, one can compose it with a translation and
produce an isometry that maps zero to zero. Conversely, any isometry is the
composite of one that maps zero to zero and a translation.

Theorem 12. Let $M$ be an isometric mapping of a Euclidean space into itself
that maps zero to zero:

$$ M(0) = 0. \qquad (27) $$

(i) $M$ is linear.

(ii)

$$ M^* M = I. \qquad (28) $$

Conversely, if (28) is satisfied, $M$ is an isometry.

(iii) $M$ is invertible and its inverse is an isometry.

(iv) $\det M = \pm 1$.

Proof. It follows from (26) with $y = 0$ and (27) that

$$ \|M(x)\| = \|x\|. \qquad (29) $$

Now let us abbreviate the action of $M$ by $'$:

$$ M(x) = x', \quad M(y) = y'. $$

By (29),

$$ \|x'\| = \|x\|, \quad \|y'\| = \|y\|. \qquad (29)' $$

By (26),

$$ \|x' - y'\| = \|x - y\|. $$

Square and use expansion (4) on both sides:

$$ \|x'\|^2 - 2(x', y') + \|y'\|^2 = \|x\|^2 - 2(x, y) + \|y\|^2. $$

Using (29)', we conclude that

$$ (x', y') = (x, y); \qquad (30) $$

that is, $M$ preserves the scalar product.

Let $z$ be any other vector, $z' = M(z)$; then, using (4) twice we get

$$ \|z' - x' - y'\|^2 = \|z'\|^2 + \|y'\|^2 + \|x'\|^2 - 2(z', x') - 2(z', y') + 2(x', y'). $$

Similarly,

$$ \|z - x - y\|^2 = \|z\|^2 + \|y\|^2 + \|x\|^2 - 2(z, x) - 2(z, y) + 2(x, y). $$

Using (29)' and (30) we deduce that

$$ \|z' - x' - y'\|^2 = \|z - x - y\|^2. $$

We choose now $z = x + y$; then the right-hand side above is zero; therefore so is
$\|z' - x' - y'\|^2$. By positive definiteness of the norm, $z' - x' - y' = 0$. This proves
part (i) of Theorem 12.

To prove part (ii), we take relation (30) and use the adjointness identity (23)':

$$ (Mx, My) = (x, M^* M y) = (x, y) $$

for all $x$ and $y$, so

$$ (x, M^* M y - y) = 0. $$

Since this holds for all $x$, it follows that $M^* M y - y$ is orthogonal to itself, and so, by
positiveness of norm, that for all $y$,

$$ M^* M y - y = 0. $$

The converse follows by reversing the steps; this proves part (ii).

It follows from (29) that the nullspace of $M$ consists of the zero vector; it follows
then from Corollary (B)' of Chapter 3 that $M$ is invertible. That $M^{-1}$ is an isometry is
obvious. This proves (iii).

It was pointed out in equation (33) of Chapter 5 that for every matrix $\det M^* = \det M$;
it follows from (28) and the product rule for determinants [see (18)
in Chapter 5] that $(\det M)^2 = \det I = 1$, which implies that

$$ \det M = \pm 1. \qquad (31) $$

This proves part (iv) of Theorem 12. □

The geometric meaning of (iv) is that a mapping that preserves distances also
preserves volume.

Definition. A matrix that maps $\mathbb{R}^n$ onto itself isometrically is called orthogonal.

The orthogonal matrices of a given order form a group under matrix
multiplication. Clearly, composites of isometries are isometric, and so, by part
(iii) of Theorem 12, are their inverses.

The orthogonal matrices whose determinant is plus 1 form a subgroup, called the
special orthogonal group. Examples of orthogonal matrices with determinant plus 1
in three-dimensional space are rotations; see Chapter 11.

Exercise 3. Construct the matrix representing reflection of points in $\mathbb{R}^3$ across
the plane $x_3 = 0$. Show that the determinant of this matrix is $-1$.

Exercise 4. Let $R$ be reflection across any plane in $\mathbb{R}^3$.

(i) Show that $R$ is an isometry.

(ii) Show that $R^2 = I$.

(iii) Show that $R^* = R$.

We recall from Chapter 4 that the $ij$th entry of the matrix product $AB$ is the scalar
product of the $i$th row of $A$ with the $j$th column of $B$. The $i$th row of $M^*$ is the
transpose of the $i$th column of $M$. Therefore the identity $M^* M = I$ characterizing
orthogonal matrices can be formulated as follows:

Corollary 12'. A matrix $M$ is orthogonal iff its columns are pairwise orthogonal
unit vectors.

Exercise 5. Show that a matrix $M$ is orthogonal iff its rows are pairwise
orthogonal unit vectors.
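The characterization $M^* M = I$ and the determinant statement (31) are easy to test on examples. The following sketch checks a rotation about the third axis, a coordinate swap, and a shear in $\mathbb{R}^3$; the matrices and helper names are illustrative assumptions, and the shear is included to show that $\det M = \pm 1$ alone does not make a matrix orthogonal.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def is_orthogonal(M, tol=1e-12):
    """Check M^T M = I, i.e. the columns are pairwise orthogonal unit vectors."""
    P = matmul(transpose(M), M)
    n = len(M)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

t = 0.3                                   # an arbitrary rotation angle
rotation = [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t),  math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]
swap = [[0.0, 1.0, 0.0],                  # exchanges the first two coordinates
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0]]
shear = [[1.0, 1.0, 0.0],                 # determinant 1, but not an isometry
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

product = matmul(rotation, swap)          # composite of two isometries
```

Here `rotation` has determinant plus 1 and `swap` has determinant minus 1; both pass `is_orthogonal`, and so does their product, in line with the group property.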

How can we measure the size of a linear mapping $A$ of one Euclidean space $X$ into
another Euclidean space $U$? Recall from a rigorous course on the foundations of
calculus the concept of least upper bound, also called supremum, of a bounded set of
real numbers, abbreviated as sup. Each component of $Ax$ is a linear function of the
components of $x$; $\|Ax\|^2$ is a quadratic function of the components of $x$, and
therefore the set of numbers $\|Ax\|^2$, $\|x\|^2 = 1$, is a bounded set.

Definition.

$$ \|A\| = \sup_{\|x\| = 1} \|Ax\|. \qquad (32) $$

Note that $\|Ax\|$ is measured in $U$, $\|x\|$ in $X$. $\|A\|$ is called the norm of $A$.

Theorem 13. Let $A$ be a linear mapping from the Euclidean space $X$ into the
Euclidean space $U$, where $\|A\|$ is its norm.

(i)

$$ \|Az\| \le \|A\| \, \|z\| \quad \text{for all } z \text{ in } X. \qquad (33) $$

(ii)

$$ \|A\| = \sup_{\|x\| = 1,\ \|v\| = 1} (Ax, v). \qquad (34) $$

Proof. (i) follows for unit vectors $z$ from the definition (32) of $\|A\|$. For any
$z \ne 0$, write $z = kx$, $x$ a unit vector; since $\|Akx\| = \|kAx\| = |k| \, \|Ax\|$ and
$\|kx\| = |k| \, \|x\|$, (33) follows. For $z = 0$, (33) is obviously true.

(ii) According to Theorem 2,

$$ \|u\| = \max\, (u, v), \quad \|v\| = 1. $$

Set $Ax = u$ in definition (32), and we obtain (34). □

Exercise 6. Show that $|a_{ij}| \le \|A\|$.

Theorem 14. For $A$ as in Theorem 13, we have the following:

(i) $\|kA\| = |k| \, \|A\|$ for any scalar $k$.

(ii) For any pair of linear mappings $A$ and $B$ of $X$ into $U$,

$$ \|A + B\| \le \|A\| + \|B\|. \qquad (35) $$

(iii) Let $A$ be a linear mapping of $X$ into $U$, and let $C$ be a linear mapping of $U$
into $V$; then

$$ \|CA\| \le \|C\| \, \|A\|. \qquad (36) $$

(iv)

$$ \|A^*\| = \|A\|. \qquad (37) $$

Proof. (i) follows from the observation that $\|kAx\| = |k| \, \|Ax\|$.

(ii) By the triangle inequality (12), for all $x$ in $X$ we obtain

$$ \|(A + B)x\| = \|Ax + Bx\| \le \|Ax\| + \|Bx\|. $$

The supremum of the left-hand side for $\|x\| = 1$ is $\|A + B\|$. The right-hand side
is a sum of two terms; the supremum of the sum is $\le$ the sum of the suprema, which
is $\|A\| + \|B\|$.

(iii) By inequality (33),

$$ \|CAx\| \le \|C\| \, \|Ax\|. $$

Combined with (33), this yields

$$ \|CAx\| \le \|C\| \, \|A\| \, \|x\|. $$

Taking the supremum for all unit vectors $x$ gives (36).

(iv) According to (23)',

$$ (Ax, v) = (x, A^* v); $$

since the scalar product is a symmetric function, we obtain

$$ (Ax, v) = (A^* v, x). $$

Take the supremum of both sides for all $x$ and $v$, $\|x\| = 1$, $\|v\| = 1$. According to
(34), on the left-hand side we get $\|A\|$, and on the right-hand side we obtain
$\|A^*\|$. □

The following result is enormously useful:

Theorem 15. Let $A$ be a linear mapping of a finite-dimensional Euclidean
space $X$ into itself that is invertible. Denote by $B$ another linear mapping of $X$ into $X$
close to $A$ in the sense of the following inequality:

$$ \|A - B\| < 1 / \|A^{-1}\|. \qquad (38) $$

Then $B$ is invertible.

Proof. Denote $A - B = C$, so that $B = A - C$. Factor $B$ as

$$ B = A(I - A^{-1}C) = A(I - S), $$

where $S = A^{-1}C$.

We have seen in Chapter 3 that the product of invertible maps is invertible;
therefore it suffices to show that $I - S$ is invertible. We see that it suffices to show
that the nullspace of $I - S$ is trivial. Suppose not; that is, $(I - S)x = 0$, $x \ne 0$. Then
$x = Sx$; using the definition of the norm of $S$,

$$ \|x\| = \|Sx\| \le \|S\| \, \|x\|. $$

Since $x \ne 0$, it follows that

$$ 1 \le \|S\|. \qquad (39) $$

But according to part (iii) of Theorem 14,

$$ \|S\| = \|A^{-1}C\| \le \|A^{-1}\| \, \|C\| < 1, $$

where in the last step we have used inequality (38): $\|C\| < 1/\|A^{-1}\|$. This
contradicts (39). □
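Theorem 15 can be explored numerically for $2 \times 2$ matrices. The sketch below relies on a standard fact that is not derived in this section, namely that the norm (32) of a real matrix equals the square root of the largest eigenvalue of $M^T M$; for a $2 \times 2$ matrix that eigenvalue has a closed form. The matrices and function names are hypothetical choices for the illustration.

```python
import math

def op_norm_2x2(M):
    """Norm (32) of a real 2x2 matrix, computed as the square root of the
    largest eigenvalue of M^T M (a standard fact, assumed here)."""
    (a, b), (c, d) = M
    p = a * a + c * c          # (M^T M)_11
    q = a * b + c * d          # (M^T M)_12
    r = b * b + d * d          # (M^T M)_22
    lam_max = 0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q))
    return math.sqrt(lam_max)

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[2.0, 0.0], [0.0, 1.0]]
bound = 1.0 / op_norm_2x2(inv2(A))       # right-hand side of (38); equals 1 here

C_small = [[0.3, 0.4], [0.4, -0.3]]      # norm 0.5, so (38) holds for B = A - C
C_edge = [[0.0, 0.0], [0.0, 1.0]]        # norm 1, so (38) just fails

B_small = sub(A, C_small)                # invertible, as Theorem 15 predicts
B_edge = sub(A, C_edge)                  # singular: the strict inequality matters
```

The second perturbation shows that the strict inequality in (38) cannot be relaxed to equality: `B_edge` has determinant zero.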

Note. In this proof we have used the finite dimensionality of $X$. A proof of
Theorem 15 given in Chapter 15 is valid for infinite-dimensional $X$.

We recall now another concept from a rigorous calculus course:

Convergence. A sequence of numbers $\{a_k\}$ tends to $a$,

$$ \lim a_k = a, $$

if $|a_k - a|$ tends to zero. Recall furthermore the notion of a Cauchy sequence of
numbers $\{a_k\}$; it is a sequence for which $|a_j - a_k|$ tends to zero as $j$ and $k$ tend to $\infty$.
A basic property of real numbers is that every Cauchy sequence of numbers
converges to a limit.

This property of real numbers is called completeness.

A second basic notion about real numbers is local compactness: Every bounded
sequence of real numbers contains a convergent subsequence.

We now show how to extend these notions and results from numbers to vectors in
a finite-dimensional Euclidean space.

Definition. A sequence of vectors $\{x_k\}$ in a linear space $X$ with Euclidean
structure converges to the limit $x$:

$$ \lim_{k \to \infty} x_k = x $$

if $\|x_k - x\|$ tends to zero as $k \to \infty$.

Theorem 16. A sequence of vectors $\{x_k\}$ in a Euclidean space $X$ is called a
Cauchy sequence if $\|x_k - x_j\| \to 0$ as $k$ and $j \to \infty$.

(i) Every Cauchy sequence in a finite-dimensional Euclidean space converges to
a limit.

A sequence of vectors $\{x_k\}$ in a Euclidean space $X$ is called bounded if $\|x_k\| \le R$
for all $k$, $R$ some real number.

(ii) In a finite-dimensional Euclidean space every bounded sequence contains a
convergent subsequence.

Proof. (i) Let $x$ and $y$ be two vectors in $X$, $a_j$ and $b_j$ their $j$th components; then

$$ |a_j - b_j| \le \|x - y\|. $$

Denote by $a_{k,j}$ the $j$th component of $x_k$. Since $\{x_k\}$ is a Cauchy sequence, it follows
that the sequence of numbers $\{a_{k,j}\}$ also is a Cauchy sequence. Since the real
numbers are complete, the $\{a_{k,j}\}$ converge to a limit $a_j$. Denote by $x$ the vector whose
components are $(a_1, \ldots, a_n)$. From the definition of Euclidean norm,

$$ \|x_k - x\|^2 = \sum_{j=1}^{n} |a_{k,j} - a_j|^2, \qquad (40) $$

it follows that $\lim x_k = x$.

(ii) Since $|a_{k,j}| \le \|x_k\|$, it follows that $|a_{k,j}| \le R$ for all $k$. Because the real
numbers are locally compact, a subsequence of $\{a_{k,1}\}$ converges to a limit $a_1$.
This subsequence of $k$'s contains a further subsubsequence such that $\{a_{k,2}\}$
converges to a limit $a_2$. Proceeding in this fashion we can construct a subsequence of
$\{x_k\}$ for which all sequences $\{a_{k,j}\}$ converge to a limit $a_j$, $j = 1, \ldots, n$, where $n$ is the
dimension of $X$. Denote by $x$ the vector whose components are $(a_1, \ldots, a_n)$. From
(40) we deduce that the subsequence of $\{x_k\}$ converges to $x$. □

It follows from part (ii) of Theorem 16 that the supremum in the definition (32) of
$\|A\|$ is a maximum:

$$ \|A\| = \max_{\|x\| = 1} \|Ax\|. \qquad (32)' $$

It follows from the definition of supremum that $\|A\|$ cannot be replaced by any
smaller number that is an upper bound of $\|Ax\|$, $\|x\| = 1$. It follows that there is a
sequence of unit vectors $\{x_k\}$, $\|x_k\| = 1$, such that

$$ \lim \|Ax_k\| = \|A\|. $$

According to Theorem 16, this sequence has a subsequence that converges to a limit
$x$. This vector $x$ maximizes $\|Az\|$ for all unit vectors $z$.

Part (ii) of Theorem 16 has a converse:

Theorem 17. Let $X$ be a linear space with a Euclidean structure, and suppose
that it is locally compact, that is, that every bounded sequence $\{x_k\}$ of vectors in $X$
has a convergent subsequence. Then $X$ is finite dimensional.

Proof. We shall show that if $X$ is not finite dimensional, then it is not locally
compact. Not being finite dimensional means that given any linearly independent set
of vectors $y_1, \ldots, y_k$, there is a vector $y_{k+1}$ that is not a linear combination of them. In
this way we obtain an infinite sequence of vectors $y_1, y_2, \ldots$ such that every finite set
$\{y_1, \ldots, y_k\}$ is linearly independent. According to Theorem 4, we can apply the
Gram-Schmidt process to construct pairwise orthogonal unit vectors $\{x_1, \ldots, x_k\}$,
which are linear combinations of $y_1, \ldots, y_k$. For this infinite sequence

$$ \|x_k - x_j\|^2 = \|x_k\|^2 - 2(x_k, x_j) + \|x_j\|^2 = 2 $$

for all $k \ne j$. Therefore this sequence, which is bounded, contains no convergent
subsequence. □

Theorem 17 is a very useful and therefore important criterion for a Euclidean
space to be finite dimensional. In Chapter 14 we shall show how to extend it to all
normed linear spaces.

In Appendix 12 we shall give an interesting application.

Definition. A sequence $\{A_n\}$ of mappings converges to a limit $A$ if

$$ \lim_{n \to \infty} \|A_n - A\| = 0. $$

Exercise 7. Show that $\{A_n\}$ converges to $A$ iff for all $x$, $A_n x$ converges to $Ax$.

Note. The result in Exercise 7 does not hold in infinite-dimensional spaces.

We conclude this chapter by a brief discussion of complex Euclidean structure. In
the concrete definition of complex Euclidean space, definition (2) of the scalar
product in $\mathbb{R}^n$ has to be replaced in $\mathbb{C}^n$ by

$$ (x, y) = \sum_i x_i \bar{y}_i, \qquad (41) $$

where the bar denotes the complex conjugate. The definition of the adjoint of a
matrix is as in (23)', but in the complex case has a slightly different interpretation.
Writing

$$ A = (a_{ij}), \quad (Ax)_i = \sum_j a_{ij} x_j, $$

and using the definition (41) of scalar product we can write

$$ (Ax, u) = \sum_i \Bigl( \sum_j a_{ij} x_j \Bigr) \bar{u}_i. $$

This can be rewritten as

$$ \sum_j x_j \overline{\Bigl( \sum_i \bar{a}_{ij} u_i \Bigr)}, $$

which shows that $(Ax, u) = (x, A^* u)$, where

$$ (A^* u)_j = \sum_i \bar{a}_{ij} u_i; $$

that is, the adjoint $A^*$ of the matrix $A$ is the complex conjugate of the transpose of $A$.

We now define the abstract notion of a complex Euclidean space.

Definition. A complex Euclidean structure in a linear space $X$ over the complex
numbers is furnished by a complex-valued function of two vector arguments, called a
scalar product and denoted as $(x, y)$, with these properties:

(i) $(x, y)$ is a linear function of $x$ for $y$ fixed.

(ii) Conjugate symmetry: for all $x$, $y$,

$$ \overline{(x, y)} = (y, x). \qquad (42) $$

Note that conjugate symmetry implies that $(x, x)$ is real for all $x$.

(iii) Positivity:

$$ (x, x) > 0 \quad \text{for all } x \ne 0. $$

The theory of complex Euclidean spaces is analogous to that for real ones, with a few
changes where necessary. For example, it follows from (i) and (ii) that for $x$ fixed,
$(x, y)$ is a skew linear function of $y$, that is, additive in $y$ and satisfying for any
complex number $k$,

$$ (x, ky) = \bar{k} (x, y). \qquad (43) $$

Instead of repeating the theory, we indicate those places where a slight change is
needed. In the complex case identity (12)' is

$$ \|x + y\|^2 = \|x\|^2 + (x, y) + (y, x) + \|y\|^2 = \|x\|^2 + 2 \operatorname{Re}(x, y) + \|y\|^2, \qquad (44) $$

where $\operatorname{Re} k$ denotes the real part of the complex number $k$.

Exercise 8. Prove the Schwarz inequality for complex linear spaces with a
Euclidean structure.

Exercise 9. Prove the complex analogues of Theorems 6, 7, and 8.

We define the adjoint $A^*$ of a linear map $A$ of an abstract complex Euclidean
space into itself by relation (23)' as before:

$$ (x, A^* u) = (Ax, u). $$

Exercise 10. Prove the complex analogue of Theorem 9.

We define isometric maps of a complex Euclidean space as in the real case:

$$ \|Mx\| = \|x\|. $$

Definition. A linear map of a complex Euclidean space into itself that is
isometric is called unitary.

Exercise 11. Show that a unitary map $M$ satisfies the relation

$$ M^* M = I \qquad (45) $$

and, conversely, that every map $M$ that satisfies (45) is unitary.

Exercise 12. Show that if $M$ is unitary, so are $M^{-1}$ and $M^*$.

Exercise 13. Show that the unitary maps form a group under multiplication.

Exercise 14. Show that for a unitary map $M$, $|\det M| = 1$.

Exercise 15. Let $X$ be the space of continuous complex-valued functions on
$[-1, 1]$ and define the scalar product in $X$ by

$$ (f, g) = \int_{-1}^{1} f(s) \bar{g}(s) \, ds. $$

Let $m(s)$ be a continuous function of absolute value 1: $|m(s)| = 1$, $-1 \le s \le 1$.
Define $M$ to be multiplication by $m$:

$$ (Mf)(s) = m(s) f(s). $$

Show that $M$ is unitary.

We give now a simple but useful lower bound for the norm of a matrix mapping a
complex Euclidean space $X$ into itself. The definition of the norm of such a matrix is
the same as in the real case, given by equation (32)':

$$ \|A\| = \max_{\|x\| = 1} \|Ax\|. $$

Let $A$ be any square matrix with complex entries, $h$ one of its eigenvectors,
chosen to have length 1, and $a$ the eigenvalue:

$$ Ah = ah, \quad \|h\| = 1. $$

Then

$$ \|Ah\| = \|ah\| = |a|. $$

Since $\|A\|$ is the maximum of $\|Ax\|$ for all unit vectors $x$, it follows that
$\|A\| \ge |a|$. This is true for every eigenvalue; therefore

$$ \|A\| \ge \max_i |a_i|, \qquad (46) $$

where the $a_i$ range over all eigenvalues of $A$.


Definition. The spectral radius $r(A)$ of a linear mapping $A$ of a linear space
into itself is

$$ r(A) = \max_j |a_j|, \qquad (47) $$

where the $a_j$ range over all eigenvalues of $A$. So (46) can be restated as follows:

$$ \|A\| \ge r(A). \qquad (48) $$

Recall that the eigenvalues of the powers of $A$ are the powers of the eigenvalues
of $A$:

$$ A^j h = a^j h. $$

Applying (48) to $A^j$, we conclude that

$$ \|A^j\| \ge r(A)^j. $$

Taking the $j$th root gives

$$ \|A^j\|^{1/j} \ge r(A). \qquad (48)_j $$

Theorem 18. As $j$ tends to $\infty$, $(48)_j$ tends to be an equality; that is,

$$ \lim_{j \to \infty} \|A^j\|^{1/j} = r(A). $$

A proof will be furnished in Appendix 10.
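The slow approach of $\|A^j\|^{1/j}$ to $r(A)$ can be observed on a small example. The sketch below uses an upper triangular $2 \times 2$ matrix whose only eigenvalue is $0.5$, and computes the norm through the standard fact (assumed here, not proved in this chapter) that $\|M\|^2$ is the largest eigenvalue of $M^T M$; the matrix and names are illustrative.

```python
import math

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm_2x2(M):
    """Norm of a real 2x2 matrix via the largest eigenvalue of M^T M."""
    (a, b), (c, d) = M
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    return math.sqrt(0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q)))

A = [[0.5, 1.0],
     [0.0, 0.5]]            # triangular, so both eigenvalues equal 0.5
spectral_radius = 0.5

roots = {}                   # j -> ||A^j||^(1/j) for a few sample values of j
P = [[1.0, 0.0], [0.0, 1.0]]
for j in range(1, 201):
    P = matmul2(P, A)        # P is now A^j
    if j in (1, 10, 50, 200):
        roots[j] = op_norm_2x2(P) ** (1.0 / j)
```

For this matrix $\|A\|$ is well above $r(A)$, yet the values stored in `roots` decrease toward $0.5$ as $j$ grows while never dropping below it, consistent with $(48)_j$ and Theorem 18.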

We shall give now a simple and useful upper bound for the norm of a real $m \times n$
matrix

$$ A = (a_{ij}), $$

mapping $\mathbb{R}^n$ into $\mathbb{R}^m$. For any $x$ in $\mathbb{R}^n$, set $Ax = y$, $y$ in $\mathbb{R}^m$. The components of $y$ are
expressed in terms of the components of $x$ as follows:

$$ y_i = \sum_j a_{ij} x_j. $$

Estimate the right-hand side using the Schwarz inequality:

$$ y_i^2 \le \Bigl( \sum_j a_{ij}^2 \Bigr) \Bigl( \sum_j x_j^2 \Bigr); $$

adding all these inequalities, $i = 1, \ldots, m$, we get

$$ \sum_i y_i^2 \le \Bigl( \sum_{i,j} a_{ij}^2 \Bigr) \sum_j x_j^2. \qquad (49) $$

Using the definition of norm in Euclidean space [see equation (1)], we can rewrite
inequality (49) as

$$ \|y\|^2 \le \Bigl( \sum_{i,j} a_{ij}^2 \Bigr) \|x\|^2. $$

Take the square root of this inequality; since $y = Ax$, we can write it as

$$ \|Ax\| \le \Bigl( \sum_{i,j} a_{ij}^2 \Bigr)^{1/2} \|x\|. \qquad (50) $$

The definition of the norm $\|A\|$ of the matrix $A$ is

$$ \sup \|Ax\|, \quad \|x\| = 1. $$

It follows from (50) that

$$ \|A\| \le \Bigl( \sum_{i,j} a_{ij}^2 \Bigr)^{1/2}; \qquad (51) $$

this is the upper bound for $\|A\|$ we set out to prove.
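The lower bound (48) and the upper bound (51) bracket the norm, which can be seen on a concrete matrix. The sketch below uses a hypothetical upper triangular $2 \times 2$ matrix (so its eigenvalues can be read off the diagonal) and again computes $\|A\|$ from the largest eigenvalue of $A^T A$, a standard fact assumed rather than taken from this section.

```python
import math

def op_norm_2x2(M):
    """Norm of a real 2x2 matrix via the largest eigenvalue of M^T M."""
    (a, b), (c, d) = M
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    return math.sqrt(0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q)))

A = [[2.0, 1.0],
     [0.0, 1.0]]            # triangular: the eigenvalues are 2 and 1

spectral_radius = max(abs(A[0][0]), abs(A[1][1]))            # lower bound (48)
upper = math.sqrt(sum(a * a for row in A for a in row))      # right side of (51)
norm = op_norm_2x2(A)                                        # the norm itself
```

Here the three numbers are $2$, about $2.288$, and $\sqrt{6} \approx 2.449$, so neither bound is attained for this matrix.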


Exercise 16. Prove the following analogue of (51) for matrices with complex
entries:

$$ \|A\| \le \Bigl( \sum_{i,j} |a_{ij}|^2 \Bigr)^{1/2}. \qquad (51)' $$

Exercise 17. Show that

$$ \sum_{i,j} |a_{ij}|^2 = \operatorname{tr} AA^*. \qquad (52) $$

Exercise 18. Show that

$$ \operatorname{tr} AA^* = \operatorname{tr} A^* A. $$

Exercise 19. Find an upper bound and a lower bound for the norm of the 2 x 2 
matrix 

A = 

The quantity $\bigl( \sum_{i,j} |a_{ij}|^2 \bigr)^{1/2}$ is called the Hilbert-Schmidt norm of the matrix $A$.

Let $T$ denote a $3 \times 3$ matrix, its columns $x$, $y$, and $z$:

$$ T = (x, y, z). $$

The determinant of $T$ is, for $x$ and $y$ fixed, a linear function of $z$:

$$ \det(x, y, z) = l(z). \qquad (53) $$

According to Theorem 5, every linear function can be represented as a scalar
product:

$$ l(z) = (w, z), \qquad (54) $$

where $w$ is some vector depending on $x$ and $y$:

$$ w = w(x, y). $$

Combining (53) and (54) gives

$$ \det(x, y, z) = (w(x, y), z). \qquad (55) $$

We formulate the properties of the dependence of $w$ on $x$ and $y$ as a series of
exercises:



Exercise 20. (i) $w$ is a bilinear function of $x$ and $y$. Therefore we write $w$ as a
product of $x$ and $y$, denoted as

$$ w = x \times y, $$

and called the cross product.

(ii) Show that the cross product is antisymmetric:

$$ y \times x = -x \times y. $$

(iii) Show that $x \times y$ is orthogonal to both $x$ and $y$.

(iv) Let $R$ be a rotation in $\mathbb{R}^3$; show that

$$ (Rx) \times (Ry) = R(x \times y). $$

(v) Show that

$$ \|x \times y\| = \pm \|x\| \, \|y\| \sin\theta, $$

where $\theta$ is the angle between $x$ and $y$.

(vi) Show that 


(SH!)=0 

(vii) Using Exercise 16 in Chapter 5, show that 



Exercise 21. Show that in a Euclidean space every pair of vectors satisfies

$$ \|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2. \qquad (56) $$



CHAPTER 8


Spectral Theory of Self-Adjoint
Mappings of a Euclidean Space
into Itself


In this chapter we shall study mappings $A$ of Euclidean spaces into themselves that
are self-adjoint, that is, are their own adjoints:

$$ A^* = A. $$

When $A$ acts on a real Euclidean space, any matrix representing it in an orthonormal
system of coordinates is symmetric, that is,

$$ A_{ji} = A_{ij}. $$

Such mappings are therefore also called symmetric. When $A$ acts on a complex
Euclidean space, its matrix representations are conjugate symmetric:

$$ A_{ji} = \bar{A}_{ij}. $$

Such mappings are also called Hermitean. We saw in Theorem 11 of Chapter 7 that
orthogonal projections are self-adjoint. Below we describe another large class of
self-adjoint matrices. In Chapter 11 we shall see that matrices that describe the
motion of mechanical systems are self-adjoint.


Definition. Let $M$ be an arbitrary linear mapping in a Euclidean space. We
define its self-adjoint part as

$$ M_s = \frac{M + M^*}{2}. \qquad (1) $$

Exercise 1. Show that

$$ \operatorname{Re}(x, Mx) = (x, M_s x). \qquad (2) $$
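Identity (2) can be checked numerically in the real case, where the real part is superfluous and $M^*$ is the transpose. The sketch below is only a spot check on one hypothetical matrix and vector, not a proof; it also shows that the anti-self-adjoint remainder $(M - M^*)/2$ contributes nothing to $(x, Mx)$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

M = [[1.0, 4.0],
     [0.0, 2.0]]                     # an arbitrary non-symmetric real matrix
Mt = transpose(M)

# Self-adjoint part (1) and the remaining antisymmetric part.
Ms = [[(M[i][j] + Mt[i][j]) / 2.0 for j in range(2)] for i in range(2)]
Ma = [[(M[i][j] - Mt[i][j]) / 2.0 for j in range(2)] for i in range(2)]

x = [3.0, -1.0]
lhs = dot(x, matvec(M, x))           # (x, Mx)
rhs = dot(x, matvec(Ms, x))          # (x, M_s x)
skew = dot(x, matvec(Ma, x))         # (x, M_a x), expected to be 0
```

Both `lhs` and `rhs` equal $-1$ for this choice, and `skew` is $0$.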

Let $f(x_1, \ldots, x_n) = f(x)$ be a real-valued twice-differentiable function of $n$ real
variables $x_1, \ldots, x_n$, written as a single vector variable $x$. The Taylor approximation
to $f$ at $a$ up to second order reads

$$f(a+y) = f(a) + l(y) + \tfrac{1}{2} q(y) + \|y\|^2 \epsilon(\|y\|), \tag{3}$$

where $\epsilon(d)$ denotes some function that tends to 0 as $d \to 0$, $l(y)$ is a linear function of
$y$, and $q(y)$ is a quadratic function. A linear function has the form (see Theorem 5 of
Chapter 7)

$$l(y) = (y, g); \tag{4}$$

$g$ is the gradient of $f$ at $a$; according to Taylor's theorem

$$g_i = \frac{\partial f}{\partial x_i}\bigg|_{x=a}. \tag{5}$$

The quadratic function $q$ has the form

$$q(y) = \sum_{i,j} h_{ij} y_i y_j. \tag{6}$$

The matrix $(h_{ij})$ is called the Hessian $H$ of $f$; according to Taylor's theorem,

$$h_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}\bigg|_{x=a}. \tag{7}$$

Employing matrix notation and the Euclidean scalar product, we can write $q$, given
by (6), in the form

$$q(y) = (y, Hy). \tag{8}$$

The matrix $H$ is self-adjoint, that is, $H^* = H$:

$$h_{ij} = h_{ji}; \tag{9}$$

this follows from the definition (7) of the entries $h_{ij}$ as second partial derivatives, and
the fact that the mixed partials of a twice-differentiable function are equal.
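The symmetry of the Hessian can be checked numerically. The following is a minimal sketch, not part of the text: the test function `f`, the point `a`, the step size `h`, and the helper name `hessian` are all assumptions chosen for illustration. It approximates the second partial derivatives at `a` by central differences.

```python
import math

def f(x):
    # Illustrative smooth function of two variables (an assumption, not from the text).
    x1, x2 = x
    return x1 * x1 * x2 + math.sin(x1 * x2)

def hessian(func, a, h=1e-4):
    """Approximate h_ij = d^2 f / (dx_i dx_j) at the point a by central differences."""
    n = len(a)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate func at a + si*h*e_i + sj*h*e_j.
                x = list(a)
                x[i] += si * h
                x[j] += sj * h
                return func(x)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

a = [0.7, -0.3]
H = hessian(f, a)   # H[0][1] and H[1][0] agree up to rounding
```

Up to discretization and rounding error, the computed matrix is symmetric, in line with the equality of mixed partials.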

Suppose now that $a$ is a critical point of the function $f$, that is, a point where
$\operatorname{grad} f = g$ is zero. Around such a point Taylor's formula (3) shows that the
behavior of $f$ is governed by the quadratic term. Now the behavior of functions near
critical points is of fundamental importance for dynamical systems, as well as in
geometry; this is what gives quadratic functions such an important place in
mathematics and makes the analysis of symmetric matrices such a central topic in
linear algebra.

To study a quadratic function it is often useful to introduce new variables:

$$Ly = z, \tag{10}$$

where $L$ is some invertible matrix, in terms of which $q$ has a simpler form.

Theorem 1. (a) Given a real quadratic form (6) it is possible to change
variables as in (10) so that in terms of the new variables $z$, $q$ is diagonal, that is, of
the form

$$q(L^{-1}z) = \sum_{i=1}^{n} d_i z_i^2. \tag{11}$$

(b) There are many ways to introduce new variables which diagonalize $q$;
however, the number of positive, negative, and zero diagonal terms $d_i$ appearing in
(11) is the same in all of them.

Proof. Part (a) is entirely elementary and constructive. Suppose that one of the
diagonal elements of $q$ is nonzero, say $h_{11} \neq 0$. We then group together all terms
containing $y_1$:

$$q(y) = h_{11} y_1^2 + \sum_{j=2}^{n} h_{1j} y_1 y_j + \sum_{j=2}^{n} h_{j1} y_j y_1 + \sum_{i,j=2}^{n} h_{ij} y_i y_j.$$

Since $H$ is symmetric, $h_{j1} = h_{1j}$, so we can write $q$ as

$$h_{11}\Big(y_1 + h_{11}^{-1} \sum_{j=2}^{n} h_{1j} y_j\Big)^2 - h_{11}^{-1}\Big(\sum_{j=2}^{n} h_{1j} y_j\Big)^2 + \sum_{i,j=2}^{n} h_{ij} y_i y_j.$$

Set

$$y_1 + h_{11}^{-1} \sum_{j=2}^{n} h_{1j} y_j = z_1. \tag{12}$$

We can then write

$$q(y) = h_{11} z_1^2 + q_2(y), \tag{13}$$

where $q_2$ depends only on $y_2, \ldots, y_n$.

If all diagonal terms of $q$ are zero but there is some nonzero off-diagonal term, say
$h_{12} = h_{21} \neq 0$, then we introduce $y_1 + y_2$ and $y_1 - y_2$ as new variables, which
produces a nonzero diagonal term. If all diagonal and off-diagonal terms are zero,
then $q(y) \equiv 0$ and there is nothing to prove.

We now apply induction on the number of variables $n$; using (13) shows that if the
quadratic function $q_2$ in $(n-1)$ variables can be written in form (11), then so can $q$
itself. Since $y_2, \ldots, y_n$ are related by an invertible matrix to $z_2, \ldots, z_n$, it follows
from (12) that the full set $y$ is related to $z$ by an invertible matrix. □

Exercise 2. We have described above an algorithm for diagonalizing $q$;
implement it as a computer program.
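One possible way to carry out the algorithm from the proof of Theorem 1 is sketched below; it is an illustration under stated assumptions, not the author's program. It works with exact rational arithmetic and returns a matrix `M` with $y = Mz$ together with the diagonal terms `d`, so that $M^T H M$ is diagonal. The function names (`diagonalize`, `add_col`, `swap`) are hypothetical. When all remaining diagonal entries vanish, the sketch adds one variable to another instead of passing to $y_1 + y_2$, $y_1 - y_2$; this variant also produces a nonzero diagonal term.

```python
from fractions import Fraction

def add_col(B, M, src, dst, c):
    """Congruence step: column dst += c * column src, then the same for rows of B."""
    n = len(B)
    for r in range(n):
        B[r][dst] += c * B[r][src]
    for col in range(n):
        B[dst][col] += c * B[src][col]
    for r in range(n):
        M[r][dst] += c * M[r][src]

def swap(B, M, i, j):
    """Interchange variables i and j (columns of B and M, rows of B)."""
    for r in range(len(B)):
        B[r][i], B[r][j] = B[r][j], B[r][i]
        M[r][i], M[r][j] = M[r][j], M[r][i]
    B[i], B[j] = B[j], B[i]

def diagonalize(H):
    """Return (M, d) with M invertible and M^T H M = diag(d), for symmetric H."""
    n = len(H)
    B = [[Fraction(v) for v in row] for row in H]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if B[k][k] == 0:
            # Look for a later nonzero diagonal entry to use as pivot.
            p = next((i for i in range(k + 1, n) if B[i][i] != 0), None)
            if p is None:
                # All remaining diagonal entries vanish: use an off-diagonal one.
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if B[i][j] != 0), None)
                if pair is None:
                    break                      # the remaining form is identically zero
                i, j = pair
                add_col(B, M, j, i, Fraction(1))   # now B[i][i] = 2 * B[i][j] != 0
                p = i
            if p != k:
                swap(B, M, k, p)
        for j in range(k + 1, n):
            if B[k][j] != 0:
                add_col(B, M, k, j, -B[k][j] / B[k][k])   # complete the square
    return M, [B[i][i] for i in range(n)]

H = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]   # q(y) = 2(y1*y2 + y1*y3 + y2*y3)
M, d = diagonalize(H)
```

For this sample form the diagonal terms consist of one positive and two negative numbers; by the law of inertia any other diagonalization gives the same counts.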

We turn now to part (b); denote by $p_+$, $p_-$, and $p_0$ the number of terms in (11) that
are positive, negative, and zero, respectively. We shall look at the behavior of $q$ on
subspaces $S$ of $\mathbb{R}^n$. We say that $q$ is positive on the subspace $S$ if

$$q(u) > 0 \quad \text{for every } u \text{ in } S,\ u \neq 0. \tag{14}$$

Lemma 2. The dimension of the largest subspace of $\mathbb{R}^n$ on which $q$ is positive
is $p_+$:

$$p_+ = \max \dim S, \quad q \text{ positive on } S. \tag{15}$$

Similarly,

$$p_- = \max \dim S, \quad q \text{ negative on } S. \tag{15'}$$

Proof. We shall use representation (11) for $q$ in terms of the coordinates
$z_1, \ldots, z_n$; suppose we label them so that $d_1, \ldots, d_p$ are positive, $p = p_+$, the rest
nonpositive. Define the subspace $S^+$ to consist of all vectors for which
$z_{p+1} = \cdots = z_n = 0$. Clearly $\dim S^+ = p_+$, and equally clearly, $q$ is positive on
$S^+$. This proves that $p_+$ is less than or equal to the right-hand side of (15). We claim
that equality holds. Let $S$ be any subspace whose dimension exceeds $p_+$.
For any vector $u$ in $S$, define $Pu$ as the vector whose first $p_+$ components are the same
as the first $p_+$ components of $u$, and the rest of the components are zero. The
dimension $p_+$ of the target space of this map is smaller than the dimension of the
domain space $S$. Therefore, according to Corollary A of Theorem 2, Chapter 3,
there is a nonzero vector $y$ in the nullspace of $P$. By definition of $P$, the first $p_+$ of the
$z$-components of this vector $y$ are zero. But then it follows from (11) that $q(y) \le 0$;
this shows that $q$ is not positive on $S$. This proves (15); the proof of (15)' is
analogous. □

Lemma 2 shows that the numbers $p_-$ and $p_+$ can be defined in terms of the
quadratic form $q$ itself, intrinsically, and are therefore independent of the special
choice of variables that puts $q$ in form (11). Since $p_+ + p_- + p_0 = n$, this proves part
(b) of Theorem 1. □

Part (b) of Theorem 1 is called the law of inertia.

Exercise 3. Prove that

$$p_+ + p_0 = \max \dim S, \quad q \ge 0 \text{ on } S$$

and

$$p_- + p_0 = \max \dim S, \quad q \le 0 \text{ on } S.$$

Using form (6) of $q$ we can reinterpret Theorem 1 in matrix terms. It is convenient
for this purpose to express $y$ in terms of $z$, rather than the other way around as in (10).
So we multiply (10) by $L^{-1}$, obtaining

$$y = Mz, \tag{16}$$

where $M$ abbreviates $L^{-1}$. Setting (16) into (8) gives, using the adjoint of $M$,

$$q(y) = (y, Hy) = (Mz, HMz) = (z, M^*HMz). \tag{17}$$

Clearly, $q$ in terms of $z$ is of form (11) iff $M^*HM$ is a diagonal matrix. So part (a) of
Theorem 1 can be put in the following form:

Theorem 3. Given any real self-adjoint matrix $H$, there is a real invertible
matrix $M$ such that

$$M^*HM = D, \tag{18}$$

$D$ a diagonal matrix.

For many applications it is of utmost importance to change variables so that the
Euclidean length of the old and the new variables is the same:

$$\|y\|^2 = \|z\|^2.$$

For the matrix $M$ in (16) this means that $M$ is an isometry. According to (28) of
Chapter 7, this is the case iff $M$ is orthogonal, that is, satisfies

$$M^*M = I. \tag{19}$$

It is one of the basic theorems of linear algebra, nay, of mathematics itself, that given
a real-valued quadratic form $q$, it is possible to diagonalize it by an isometric change
of variables. In matrix language, given a real symmetric matrix $H$, there is a real
invertible matrix $M$ such that both (18) and (19) hold.

We shall give two proofs of this important result. The first is based on the spectral
theory of general matrices presented in Chapter 6, specialized to self-adjoint
mappings in complex Euclidean space.

We recall from Chapter 7 that the adjoint $H^*$ of a linear map $H$ of a complex
Euclidean space $X$ into itself is defined by requiring that

$$(Hx, y) = (x, H^*y) \tag{20}$$

hold for all pairs of vectors $x$, $y$. Here the bracket $(\,,)$ is the conjugate-symmetric
scalar product introduced at the end of Chapter 7. A linear map $H$ is called
self-adjoint if

$$H^* = H.$$

For $H$ self-adjoint, (20) becomes

$$(Hx, y) = (x, Hy). \tag{20'}$$


Theorem 4. A self-adjoint map $H$ of complex Euclidean space $X$ into itself has
real eigenvalues and a set of eigenvectors that form an orthonormal basis of $X$.

Proof. According to the principal result of spectral theory, Theorem 7 of Chapter
6, the eigenvectors and generalized eigenvectors of $H$ span $X$. To deduce Theorem 4
from Theorem 7, we have to show that a self-adjoint mapping $H$ has the following
additional properties:

(a) $H$ has only real eigenvalues.

(b) $H$ has no generalized eigenvectors, only genuine ones.

(c) Eigenvectors of $H$ corresponding to different eigenvalues are orthogonal.

(a) If $a + ib$ is an eigenvalue of $H$, then $ib$ is an eigenvalue of $H - aI$, also
self-adjoint. Therefore, it suffices to show that a self-adjoint $H$ cannot have a purely
imaginary eigenvalue $ib$. Suppose it did, with eigenvector $z$:

$$Hz = ibz.$$

Take the scalar product of both sides with $z$:

$$(Hz, z) = (ibz, z) = ib(z, z). \tag{21}$$

Setting both $x$ and $y$ equal to $z$ in (20)', we get

$$(Hz, z) = (z, Hz). \tag{21'}$$

Since the scalar product is conjugate symmetric, we conclude that the two sides of
(21)' are conjugates. Since they are equal, the left-hand side of (21) is real. Therefore
so is the right-hand side; since $(z, z)$ is positive, this can be only if $b = 0$, as asserted
in (a).

(b) A generalized eigenvector $z$ satisfies

$$H^d z = 0; \tag{22}$$

here we have taken the eigenvalue to be zero, by replacing $H$ with $H - aI$. We want
to show that then $z$ is a genuine eigenvector:

$$Hz = 0. \tag{22'}$$

We take first the case $d = 2$:

$$H^2 z = 0; \tag{23}$$

we take the scalar product of both sides with $z$:

$$(H^2 z, z) = 0. \tag{23'}$$

Using (20)' with $x = Hz$, $y = z$, we get

$$(H^2 z, z) = (Hz, Hz) = \|Hz\|^2;$$

using (23)', we conclude that $\|Hz\| = 0$, which, by positivity, holds only when

$$Hz = 0.$$

We do now an induction on $d$; we rewrite (22) as

$$H^2 H^{d-2} z = 0.$$

Abbreviating $H^{d-2}z$ as $w$, we rewrite this as $H^2 w = 0$; this implies, as we have
already shown, that $Hw = 0$. Using the definition of $w$ this can be written as

$$H^{d-1} z = 0.$$

This completes the inductive step and proves (b).

(c) Consider two eigenvalues $a$ and $b$ of $H$, $a \neq b$:

$$Hx = ax, \qquad Hy = by.$$

We form the scalar product of the first relation with $y$ and of the second with $x$; since
$b$ is real we get

$$(Hx, y) = a(x, y), \qquad (x, Hy) = b(x, y).$$

By (20)' the left-hand sides are equal; therefore so are the right-hand sides. But for
$a \neq b$ this can be only if $(x, y) = 0$. This completes the proof of (c). □

Definition. The set of eigenvalues of $H$ is called the spectrum of $H$.

We show now that Theorem 4 has the consequence that real quadratic forms can
be diagonalized by real isometric transformation. Using the matrix formulation
given in Theorem 3, we state the result as follows.

Theorem 4'. Given any real self-adjoint matrix $H$, there is an orthogonal matrix
$M$ such that

$$M^*HM = D, \tag{24}$$

$D$ a diagonal matrix whose entries are the eigenvalues of $H$. $M$ satisfies $M^*M = I$.

Proof. The eigenvectors $f$ of $H$ satisfy

$$Hf = af. \tag{25}$$

$H$ is a real matrix, and according to (a), the eigenvalue $a$ is real. It follows from (25)
that the real and imaginary parts of $f$ also are eigenvectors. It follows from this easily
that we may choose an orthonormal basis consisting of real eigenvectors in each
eigenspace $N_a$. Since by (c), eigenvectors belonging to distinct eigenvalues are
orthogonal, we have an orthonormal basis of $X$ consisting of real eigenvectors $f_j$ of $H$.
Every vector $y$ in $X$ can be expressed as a linear combination of these eigenvectors:

$$y = \sum z_j f_j. \tag{25'}$$

For $y$ real, the $z_j$ are real. We denote the vector with components $z_j$ as $z$:
$z = (z_1, \ldots, z_n)$. Since the $\{f_j\}$ form an orthonormal basis,

$$\|y\|^2 = \sum z_j^2 = \|z\|^2. \tag{26}$$

Letting $H$ act on (25)' we get, using (25), that

$$Hy = \sum z_j a_j f_j. \tag{25''}$$

Setting (25)' and (25)'' into (6) we can express the quadratic form $q$ as

$$q(y) = (y, Hy) = \sum a_j z_j^2. \tag{26'}$$

This shows that the introduction of the new variables $z$ diagonalizes the quadratic
form $q$. Relation (26) says that the new vector has the same length as the old.

Denote by $M$ the relation of $z$ to $y$:

$$y = Mz.$$

Set this into (26)'; we get

$$q(y) = (y, Hy) = (Mz, HMz) = (z, M^*HMz).$$

Using (26)', we conclude that $M^*HM = D$, as claimed in (24). This completes the
proof of Theorem 4'. □

Multiply (24) by $M$ on the left and by $M^*$ on the right. Since $MM^*$ also equals $I$ for
an isometry $M$, we get

$$H = MDM^*. \tag{24'}$$

Exercise 4. Show that the columns of $M$ are the eigenvectors of $H$.
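For a real symmetric $2 \times 2$ matrix the orthogonal matrix of Theorem 4' can be written down explicitly as a rotation. The sketch below is only an illustration: the function name `orthogonal_diagonalize_2x2` and the sample matrix are assumptions, and the rotation-angle formula is the standard closed form for the $2 \times 2$ case rather than anything stated in the text. For real $M$ the adjoint $M^*$ is the transpose $M^T$.

```python
import math

def orthogonal_diagonalize_2x2(a, b, d):
    """For H = [[a, b], [b, d]] return (M, (a1, a2)) with M a rotation
    and M^T H M = diag(a1, a2)."""
    theta = 0.5 * math.atan2(2 * b, a - d)   # angle that removes the off-diagonal term
    c, s = math.cos(theta), math.sin(theta)
    M = [[c, -s], [s, c]]                    # columns (c, s) and (-s, c) are orthonormal
    a1 = a * c * c + 2 * b * c * s + d * s * s    # (f1, H f1), f1 = first column
    a2 = a * s * s - 2 * b * c * s + d * c * c    # (f2, H f2), f2 = second column
    return M, (a1, a2)

# Sample matrix H = [[2, 1], [1, 2]] (an assumption for illustration).
M, (a1, a2) = orthogonal_diagonalize_2x2(2.0, 1.0, 2.0)
```

The columns of `M` are unit eigenvectors of the sample matrix, and `a1`, `a2` are the corresponding eigenvalues, which is the content of Exercise 4 in this special case.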

We restate now Theorem 4, the spectral theorem for self-adjoint maps, in a
slightly different language. Theorem 4 asserts that the whole space $X$ can be
decomposed as the direct sum of pairwise orthogonal eigenspaces:

$$X = N^{(1)} \oplus \cdots \oplus N^{(k)}, \tag{27}$$

where $N^{(j)}$ consists of eigenvectors of $H$ with real eigenvalues $a_j$, $a_j \neq a_i$ for $j \neq i$.
That means that each $x$ in $X$ can be decomposed uniquely as the sum

$$x = x^{(1)} + \cdots + x^{(k)}, \tag{27'}$$

where $x^{(j)}$ belongs to $N^{(j)}$. Since $N^{(j)}$ consists of eigenvectors, applying $H$ to
(27)' gives

$$Hx = a_1 x^{(1)} + \cdots + a_k x^{(k)}. \tag{28}$$

Each $x^{(j)}$ occurring in (27)' is a function of $x$; we denote this dependence as
$x^{(j)} = P_j(x)$.
Since the $N^{(j)}$ are linear subspaces of $X$, it follows that $x^{(j)}$ depends linearly on $x$, that
is, the $P_j$ are linear mappings. We can rewrite (27)' and (28) as follows:

$$I = \sum_j P_j, \tag{29}$$

$$H = \sum_j a_j P_j. \tag{30}$$

Claim: The operators $P_j$ have the following properties:

(a) $$P_j P_k = 0 \ \text{ for } j \neq k, \qquad P_j^2 = P_j. \tag{31}$$

(b) Each $P_j$ is self-adjoint:

$$P_j^* = P_j. \tag{32}$$

Proof. (a) Relations (31) are immediate consequences of the definition of $P_j$.
(b) Using the expansion (27)' for $x$ and the analogous one for $y$ we get

$$(P_j x, y) = \Big(x^{(j)}, \sum_i y^{(i)}\Big) = (x^{(j)}, y^{(j)}),$$

where in the last step we have used the orthogonality of $N^{(i)}$ to $x^{(j)}$ for $j \neq i$. Similarly
we can show that

$$(x, P_j y) = (x^{(j)}, y^{(j)}).$$

Putting the two together shows that

$$(P_j x, y) = (x, P_j y).$$

According to (20), this expresses the self-adjointness of $P_j$. This proves
(32). □

We recall from Chapter 7 that a self-adjoint operator $P$ which satisfies $P^2 = P$ is
an orthogonal projection. A decomposition of the form (29), where the $P_j$ satisfy
(31), is called a resolution of the identity. $H$ in form (30) gives the spectral resolution
of $H$.

We can now restate Theorem 4 as

Theorem 5. Let $X$ be a complex Euclidean space, $H: X \to X$ a self-adjoint
linear map. Then there is a resolution of the identity, in the sense of (29), (31), and
(32), that gives a spectral resolution (30) of $H$.


The restated form of the spectral theorem is very useful for defining functions of
self-adjoint operators. We remark that its greatest importance is as the model for the
infinite-dimensional version.

Squaring relation (30) and using properties (31) of the $P_j$ we get

$$H^2 = \sum a_j^2 P_j.$$

By induction, for any natural number $m$,

$$H^m = \sum a_j^m P_j.$$

It follows that for any polynomial $p$,

$$p(H) = \sum p(a_j) P_j. \tag{33}$$

Let $f(a)$ be any real-valued function defined on the spectrum of $H$. We define $f(H)$ by
formula (33):

$$f(H) = \sum f(a_j) P_j. \tag{33'}$$

An example:

$$e^{H} = \sum e^{a_j} P_j.$$

We shall say more about this in Chapter 9.
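Formula (33)' can be tried out on a small example. The sketch below uses the symmetric matrix with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues are 1 and 3 with eigenvectors along $(1, -1)$ and $(1, 1)$; the matrix, the helper names `func_of_H` and `exp_series`, and the comparison of the exponential with its power series are assumptions made for illustration, not material from the text.

```python
import math

H = [[2.0, 1.0], [1.0, 2.0]]
eigenvalues = [1.0, 3.0]
P = [
    [[0.5, -0.5], [-0.5, 0.5]],   # orthogonal projection onto span{(1, -1)}
    [[0.5, 0.5], [0.5, 0.5]],     # orthogonal projection onto span{(1, 1)}
]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def func_of_H(f):
    """f(H) = sum_j f(a_j) P_j, as in formula (33)'."""
    return [[sum(f(a) * Pj[i][k] for a, Pj in zip(eigenvalues, P))
             for k in range(2)] for i in range(2)]

def exp_series(A, terms=30):
    """Partial sum of the power series sum_m A^m / m! (used only for comparison)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

H_squared = func_of_H(lambda a: a * a)   # agrees with matmul(H, H)
exp_H = func_of_H(math.exp)              # agrees with exp_series(H) up to rounding
```

The two projections add up to the identity and multiply to zero, so they form a resolution of the identity in the sense of (29) and (31).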

We present a series of no-cost extensions of Theorem 5.

Theorem 6. Suppose $H$ and $K$ are a pair of self-adjoint matrices that commute:

$$H^* = H, \qquad K^* = K, \qquad HK = KH.$$

Then they have a common spectral resolution, that is, there exist orthogonal
projections satisfying (29), (31), and (32) so that (30) holds, as well as

$$\sum b_j P_j = K. \tag{30'}$$


Proof. Denote by $N$ one of the eigenspaces of $H$; then for every $x$ in $N$

$$Hx = ax.$$

Applying $K$, we get

$$KHx = aKx.$$

Since $H$ and $K$ commute, we can rewrite this as

$$HKx = aKx,$$

which shows that $Kx$ is an eigenvector of $H$. So $K$ maps $N$ into itself. The restriction
of $K$ to $N$ is self-adjoint. We now apply spectral resolution of $K$ over $N$; combining
all these resolutions gives the joint spectral resolution of $H$ and $K$. □

This result can be generalized to any finite collection of pairwise commuting
self-adjoint mappings.

Definition. A linear mapping $A$ of Euclidean space into itself is called
anti-self-adjoint if

$$A^* = -A.$$

It follows from the definition of adjoint and the property of conjugate symmetry
of the scalar product that for any linear map $M$ of a complex Euclidean space into
itself,

$$(iM)^* = -iM^*. \tag{34}$$

In particular, if $A$ is anti-self-adjoint, $iA$ is self-adjoint, and Theorem 4 applies. This
yields Theorem 7.


Theorem 7. Let $A$ be an anti-self-adjoint mapping of a complex Euclidean
space into itself. Then

(a) The eigenvalues of $A$ are purely imaginary.

(b) We can choose an orthonormal basis consisting of eigenvectors of $A$.

We introduce now a class of maps that includes self-adjoint, anti-self-adjoint, and
unitary maps as special cases.

Definition. A mapping $N$ of a complex Euclidean space into itself is called
normal if it commutes with its adjoint:

$$NN^* = N^*N.$$

Theorem 8. A normal map $N$ has an orthonormal basis consisting of
eigenvectors.

Proof. If $N$ and $N^*$ commute, so do

$$H = \frac{N + N^*}{2} \quad \text{and} \quad A = \frac{N - N^*}{2}. \tag{35}$$

Clearly, $H$ is self-adjoint and $A$ is anti-self-adjoint. According to Theorem 6 applied to $H$
and $K = iA$, they have a common spectral resolution, so that there is an orthonormal
basis consisting of common eigenvectors of both $H$ and $A$. But since by (35),

$$N = H + A, \tag{35'}$$

it follows that these are also eigenvectors of $N$ as well as of $N^*$. □

Here is an application of Theorem 8.


Theorem 9. Let $U$ be a unitary map of a complex Euclidean space into itself,
that is, an isometric linear map.

(a) There is an orthonormal basis consisting of genuine eigenvectors of $U$.

(b) The eigenvalues of $U$ are complex numbers of absolute value 1.

Proof. According to equation (42) of Chapter 7, an isometric map $U$ satisfies
$U^*U = I$. This relation says that $U^*$ is a left inverse for $U$. We have shown in Chapter
3 (see Corollary B of Theorem 1 there) that a mapping that has a left inverse is
invertible, and its left inverse is also its right inverse: $UU^* = I$. These relations show
that $U$ commutes with $U^*$; thus $U$ is normal and Theorem 8 applies, proving part (a).
To prove part (b), let $f$ be an eigenvector of $U$, with eigenvalue $u$: $Uf = uf$. It
follows that $\|Uf\| = \|uf\| = |u|\,\|f\|$. Since $U$ is isometric, $|u| = 1$. □

Our first proof of the spectral resolution of self-adjoint mappings is based on the
spectral resolution of general linear mappings. This necessitates the application of
the fundamental theorem of algebra on the existence of complex roots, which then
are shown to be real. The question is inescapable: Is it possible to prove the
spectral resolution of self-adjoint mappings without resorting to the fundamental
theorem of algebra? The answer is "Yes." The new proof, given below, is in every
respect superior to the first proof. Not only does it avoid the fundamental theorem
of algebra, but in the case of real symmetric mappings it avoids the use of complex
numbers. It gives a variational characterization of eigenvalues that is very useful in
estimating the location of eigenvalues; this will be exploited systematically in
Chapter 10. Most important, the new proof can be carried over to
infinite-dimensional spaces.


Second Proof of Theorem 4. We start by assuming that $X$ has an orthonormal
basis of eigenvectors of $H$. We use the representations (26) and (26)' to write

$$\frac{(x, Hx)}{(x, x)} = \frac{\sum a_i z_i^2}{\sum z_i^2}. \tag{36}$$

We arrange the $a_i$ in increasing order:

$$a_1 \le a_2 \le \cdots \le a_n. \tag{36'}$$

It is clear from (36)' that choosing $z_1 \neq 0$ and all the other $z_i$, $i = 2, \ldots, n$, equal to 0
makes (36) as small as possible. So

$$a_1 = \min_{x \neq 0} \frac{(x, Hx)}{(x, x)}. \tag{37}$$

Similarly,

$$a_n = \max_{x \neq 0} \frac{(x, Hx)}{(x, x)}. \tag{37'}$$

The minimum and maximum, respectively, are taken on at points $x = f$ that are
eigenvectors of $H$ with eigenvalues $a_1$ and $a_n$, respectively.

We shall show now, without using the representation (36), that the minimum
problem (37) has a solution and that this solution is an eigenvector of $H$. From this
we shall deduce, by induction, that $H$ has a full set of eigenvectors.

The quotient (36) is called the Rayleigh quotient of $H$ and is abbreviated by
$R = R_H$. The numerator is abbreviated, see (6), as $q$; we shall denote the
denominator by $p$:

$$R(x) = \frac{q(x)}{p(x)} = \frac{(x, Hx)}{(x, x)}.$$

Since $H$ is self-adjoint, by (21)' $R$ is real-valued; furthermore, $R$ is a homogeneous
function of $x$ of degree zero, that is, for every scalar $k$,

$$R(kx) = R(x).$$

Therefore in seeking its maximum or minimum, it suffices to confine the search to
the unit sphere $\|x\| = 1$. In Chapter 7, Theorem 15, we have shown that in a
finite-dimensional Euclidean space $X$, every sequence of vectors on the unit sphere has a
convergent subsequence. It follows that $R(x)$ takes on its minimum at some point of
the unit sphere; call this point $f$. Let $g$ be any other vector and $t$ be a real variable;
$R(f + tg)$ is the quotient of two quadratic functions of $t$.

Using the self-adjointness of $H$ and the conjugate symmetry of the scalar product
we can express $R(f + tg)$ as

$$R(f + tg) = \frac{(f, Hf) + 2t \operatorname{Re}(g, Hf) + t^2 (g, Hg)}{(f, f) + 2t \operatorname{Re}(g, f) + t^2 (g, g)} = \frac{q(t)}{p(t)}. \tag{38}$$

Since $R$ achieves its minimum at $f$, $R(f + tg)$ achieves its minimum at $t = 0$; by
calculus its derivative there is zero:

$$\frac{d}{dt} R(f + tg)\Big|_{t=0} = \dot R = \frac{\dot q p - q \dot p}{p^2} = 0.$$

Since $\|f\| = 1$, $p = 1$; denoting $R(f) = \min R$ by $a$, we can rewrite the above as

$$\dot R = \dot q - a \dot p = 0. \tag{38'}$$

Using (38), we get readily

$$\dot q(f + tg)\big|_{t=0} = 2 \operatorname{Re}(g, Hf),$$

$$\dot p(f + tg)\big|_{t=0} = 2 \operatorname{Re}(g, f).$$

Setting this into (38)' yields

$$2 \operatorname{Re}(g, Hf - af) = 0.$$

Replacing $g$ by $ig$ shows that the imaginary part vanishes as well; we deduce that
for all $g$ in $X$,

$$2(g, Hf - af) = 0. \tag{39}$$

A vector orthogonal to all vectors $g$ is zero; since (39) holds for all $g$, it follows that

$$Hf - af = 0, \tag{39'}$$

that is, $f$ is an eigenvector and $a$ is an eigenvalue of $H$.

We prove now by induction on the dimension $n$ of $X$ that $H$ has a complete set of $n$
orthogonal eigenvectors in $X$. We consider the orthogonal complement $X_1$ of $f$, that
is, all $x$ such that

$$(x, f) = 0. \tag{39''}$$

Clearly, $\dim X_1 = \dim X - 1$. We claim that $H$ maps the space $X_1$ into itself; that is,
if $x \in X_1$, then $(Hx, f) = 0$. By self-adjointness and (39)'',

$$(Hx, f) = (x, Hf) = (x, af) = a(x, f) = 0.$$

$H$ restricted to $X_1$ is self-adjoint; since $\dim X_1 = n - 1$, induction on the
dimension of the underlying space shows that $H$ has a full set of eigenvectors on $X_1$.
These together with $f$ give a full set of $n$ orthogonal eigenvectors of $H$ on $X$.
Instead of arguing by induction we can argue by recursion; we can pose the same
minimum problem in $X_1$ that we have previously posed in the whole space, to
minimize

$$\frac{(x, Hx)}{(x, x)}$$

among all nonzero vectors in $X_1$. Again this minimum value is taken on by some
vector $x = f_2$ in $X_1$, and $f_2$ is an eigenvector of $H$. The corresponding eigenvalue
is $a_2$:

$$Hf_2 = a_2 f_2,$$

where $a_2$ is the second smallest eigenvalue of $H$. In this fashion we produce
successively a full set of eigenvectors. Notice that the $j$th eigenvector goes with the
$j$th eigenvalue arranged in increasing order. □

In the argument sketched above, the successive eigenvalues, arranged in
increasing order, are calculated through a sequence of restricted minimum
problems. We give now a characterization of the $j$th eigenvalue that makes no
reference to the eigenvectors belonging to the previous eigenvalues. This
characterization is due to E. Fischer.

Theorem 10. Let $H$ be a real symmetric linear map of a real Euclidean space $X$
of finite dimension. Denote the eigenvalues of $H$, arranged in increasing order, by
$a_1, \ldots, a_n$. Then

$$a_j = \min_{\dim S = j}\ \max_{x \text{ in } S,\ x \neq 0} \frac{(x, Hx)}{(x, x)}, \tag{40}$$

$S$ linear subspaces of $X$.

Note. (40) is called the minmax principle.


Proof. We shall show that for any linear subspace $S$ of $X$ of $\dim S = j$,

$$\max_{x \text{ in } S} \frac{(x, Hx)}{(x, x)} \ge a_j. \tag{41}$$

To prove this it suffices to display a single vector $x \neq 0$ in $S$ for which

$$\frac{(x, Hx)}{(x, x)} \ge a_j. \tag{42}$$

Such an $x$ is one that satisfies the $j - 1$ linear conditions

$$(x, f_i) = 0, \qquad i = 1, \ldots, j - 1, \tag{43}$$

where $f_i$ is the $i$th eigenvector of $H$. It follows from Corollary A of Theorem 1 in
Chapter 3 that every subspace $S$ of dimension $j$ has a nonzero vector satisfying
$j - 1$ linear conditions (43). The expansion (25)' of such an $x$ in terms of the
eigenvectors of $H$ contains no contribution from the first $j - 1$ eigenvectors; that is,
in (36), $z_i = 0$ for $i < j$. It follows then from (36) that for such $x$, (42) holds. This
completes the proof of (41).

To complete the proof of Theorem 10 we have to exhibit a single subspace $S$ of
dimension $j$ such that

$$a_j \ge \frac{(x, Hx)}{(x, x)} \tag{44}$$

holds for all $x$ in $S$. Such a subspace is the space spanned by $f_1, \ldots, f_j$. Every $x$ in this
space is of form $\sum_1^j z_i f_i$; since $a_i \le a_j$ for $i \le j$, inequality (44) follows
from (36). □

The calculations and arguments presented above show an important property of
the Rayleigh quotient:

(i) Every eigenvector $h$ of $H$ is a critical point of $R_H$; that is, the first derivatives
of $R_H(x)$ are zero when $x$ is an eigenvector of $H$. Conversely, the
eigenvectors are the only critical points of $R_H(x)$.

(ii) The value of the Rayleigh quotient at an eigenvector $f$ is the corresponding
eigenvalue of $H$:

$$R_H(f) = a \quad \text{when } Hf = af.$$

This observation has the following important consequence:

Suppose $g$ is an approximation of an eigenvector $f$ within a deviation of $\epsilon$:

$$\|g - f\| \le \epsilon. \tag{45}$$

Then $R_H(g)$ is an approximation of the eigenvalue $a$ within $O(\epsilon^2)$:

$$|R_H(g) - a| \le O(\epsilon^2). \tag{45'}$$

This result is a direct consequence of the Taylor approximation of the function
$R_H(x)$ near the point $x = f$.

The estimate (45)' is very useful for devising numerical methods to calculate the
eigenvalues of matrices.
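The quadratic accuracy in (45)' is easy to observe numerically. The following sketch is an illustration only: the sample matrix, the perturbation direction `u`, and the helper name `rayleigh` are assumptions. It perturbs a unit eigenvector by $\epsilon$ and records how far the Rayleigh quotient moves from the eigenvalue.

```python
def rayleigh(H, x):
    """Rayleigh quotient (x, Hx) / (x, x) for a real symmetric matrix H."""
    n = len(x)
    Hx = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(xi * hi for xi, hi in zip(x, Hx)) / sum(xi * xi for xi in x)

H = [[2.0, 1.0], [1.0, 2.0]]   # sample matrix (an assumption); eigenvalues 1 and 3
s = 2 ** -0.5
f = [s, s]                     # unit eigenvector with H f = 3 f
u = [s, -s]                    # unit vector orthogonal to f
a = 3.0

errors = {}
for eps in (1e-1, 1e-2, 1e-3):
    g = [f[0] + eps * u[0], f[1] + eps * u[1]]   # ||g - f|| = eps
    errors[eps] = abs(rayleigh(H, g) - a)        # shrinks like eps**2
```

For this particular example the error works out to $2\epsilon^2/(1+\epsilon^2)$, so reducing $\epsilon$ by a factor of 10 reduces the error in the eigenvalue by about a factor of 100.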

We now give a useful extension of the variational characterization of the
eigenvalues of a self-adjoint mapping. In a Euclidean space $X$, real or complex, we
consider two self-adjoint mappings, $H$ and $M$; we assume that the second one, $M$, is
positive.

Definition. A self-adjoint mapping $M$ of a Euclidean space $X$ into itself is called
positive if for all nonzero $x$ in $X$

$$(x, Mx) > 0.$$

It follows from the definition and properties of scalar product that the identity $I$ is
positive. There are many others; these will be studied systematically in Chapter 10.
We now form a generalization of the Rayleigh quotient:

$$R_{H,M}(x) = \frac{(x, Hx)}{(x, Mx)}. \tag{46}$$

Note that when $M = I$, we are back at the old Rayleigh quotient. We now pose for
the generalized Rayleigh quotient the same minimum problem that we posed before
for the original Rayleigh quotient: Minimize $R_{H,M}(x)$, that is, find a nonzero vector $x$
that solves

$$\min \frac{(x, Hx)}{(x, Mx)}. \tag{47}$$


Exercise 5. (a) Show that the minimum problem (47) has a nonzero solution $f$.

(b) Show that a solution $f$ of the minimum problem (47) satisfies the equation

$$Hf = bMf, \tag{48}$$

where the scalar $b$ is the value of the minimum (47).

(c) Show that the constrained minimum problem

$$\min_{(y, Mf) = 0} \frac{(y, Hy)}{(y, My)} \tag{47'}$$

has a nonzero solution $g$.

(d) Show that a solution $g$ of the minimum problem (47)' satisfies the equation

$$Hg = cMg, \tag{48'}$$

where the scalar $c$ is the value of the minimum (47)'.

Theorem 11. Let $X$ be a finite-dimensional Euclidean space, let $H$ and $M$ be
two self-adjoint mappings of $X$ into itself, and let $M$ be positive. Then there exists a
basis $f_1, \ldots, f_n$ of $X$ where each $f_i$ satisfies an equation of the form

$$Hf_i = b_i Mf_i, \qquad b_i \text{ real}, \tag{49}$$

and

$$(f_i, Mf_j) = 0 \quad \text{for } i \neq j.$$

Exercise 6. Prove Theorem 11.

Exercise 7. Characterize the numbers $b_i$ in Theorem 11 by a minimax principle
similar to (40).

The following useful result is an immediate consequence of Theorem 11.

Theorem 11'. Let $H$ and $M$ be self-adjoint, $M$ positive. Then all the eigenvalues
of $M^{-1}H$ are real. If $H$ is positive, all eigenvalues of $M^{-1}H$ are positive.

Exercise 8. Prove Theorem 11'.

Exercise 9. Give an example to show that Theorem 11 is false if $M$ is not
positive.
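Theorem 11' can be illustrated in the $2 \times 2$ real case, where the eigenvalues of $M^{-1}H$ are the roots $b$ of $\det(H - bM) = 0$, since $\det(M^{-1}H - bI) = \det(M^{-1}) \det(H - bM)$. The sketch below is a hedged illustration: the function name `generalized_eigenvalues_2x2` and the sample matrices are assumptions, and the quadratic formula stands in for a general eigenvalue routine.

```python
import math

def generalized_eigenvalues_2x2(H, M):
    """Roots b of det(H - b M) = 0 for real symmetric 2x2 H and positive M."""
    (h11, h12), (_, h22) = H
    (m11, m12), (_, m22) = M
    qa = m11 * m22 - m12 * m12                     # det M, positive when M is positive
    qb = -(h11 * m22 + h22 * m11 - 2 * h12 * m12)  # coefficient of b
    qc = h11 * h22 - h12 * h12                     # det H
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        raise ValueError("complex roots: M is presumably not positive")
    root = math.sqrt(disc)
    return ((-qb - root) / (2 * qa), (-qb + root) / (2 * qa))

M = [[2.0, 1.0], [1.0, 1.0]]              # positive: det = 1 > 0, trace = 3 > 0
H_indefinite = [[1.0, 2.0], [2.0, 3.0]]   # det = -1 < 0, so H is not positive
H_positive = [[2.0, 1.0], [1.0, 3.0]]     # positive

b_indefinite = generalized_eigenvalues_2x2(H_indefinite, M)  # real, of both signs
b_positive = generalized_eigenvalues_2x2(H_positive, M)      # real and positive
```

Note that $M^{-1}H$ itself need not be symmetric in these examples; the reality of its eigenvalues comes from the positivity of $M$, not from symmetry of the product.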

We recall from formula (32)' of Chapter 7 the definition of the norm of a linear
mapping $A$ of a Euclidean space $X$ into itself,

$$\|A\| = \max \|Ax\|, \qquad \|x\| = 1.$$

When the mapping is normal, that is, commutes with its adjoint, we can express its
norm as follows.

Theorem 12. Suppose $N$ is a normal mapping of a Euclidean space $X$ into itself.
Then

$$\|N\| = \max |n_j|, \tag{50}$$

where the $n_j$ are the eigenvalues of $N$.

Exercise 10. Prove Theorem 12. (Hint: Use Theorem 8.)

Exercise 11. We define the cyclic shift mapping $S$, acting on vectors in $\mathbb{C}^n$, by
$S(a_1, a_2, \ldots, a_n) = (a_n, a_1, \ldots, a_{n-1})$.

(a) Prove that $S$ is an isometry in the Euclidean norm.

(b) Determine the eigenvalues and eigenvectors of $S$.

(c) Verify that the eigenvectors are orthogonal.

Remark. The expansion of a vector $v$ in terms of the eigenvectors of $S$ is called
the finite Fourier transform of $v$. See Appendix 9.
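The cyclic shift can be explored numerically before attempting the proofs. The sketch below is an illustration under assumptions: it tests the candidate eigenvectors with components $\omega^{jk}$, $\omega = e^{2\pi i/n}$, which is the standard guess suggested by the remark on the finite Fourier transform; the helper names and the convention of conjugating the second factor in the scalar product are choices made here, not taken from the text.

```python
import cmath

def shift(a):
    """Cyclic shift S(a_1, ..., a_n) = (a_n, a_1, ..., a_{n-1})."""
    return [a[-1]] + a[:-1]

def eigenpair(n, k):
    """Candidate eigenvalue and eigenvector of S built from the n-th root of unity."""
    omega = cmath.exp(2j * cmath.pi / n)
    f = [omega ** (j * k) for j in range(n)]   # components omega^(j*k), j = 0..n-1
    return omega ** (-k), f

def inner(x, y):
    """Conjugate-symmetric scalar product (second factor conjugated: an assumed convention)."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

n = 4
pairs = [eigenpair(n, k) for k in range(n)]
# Largest deviation of S f from (eigenvalue) * f over all candidate pairs.
residuals = [max(abs(sf - lam * fj) for sf, fj in zip(shift(f), f)) for lam, f in pairs]
```

Each entry of `residuals` is zero up to rounding, each eigenvalue has absolute value 1, and distinct candidate eigenvectors have vanishing scalar product, in agreement with Theorem 9.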

Theorem 13. Let $A$ be a linear mapping of a finite-dimensional Euclidean
space $X$ into another finite-dimensional Euclidean space $U$. The norm $\|A\|$ of $A$
equals the square root of the largest eigenvalue of $A^*A$.

Proof. $\|Ax\|^2 = (Ax, Ax) = (x, A^*Ax)$. According to the Schwarz inequality,
the right-hand side is $\le \|x\|\,\|A^*Ax\|$. It follows that for unit vectors $x$, $\|x\| = 1$,

$$\|Ax\|^2 \le \|A^*Ax\|. \tag{51}$$

$A^*A$ is a self-adjoint mapping; according to formula (37)', we have

$$\max_{\|x\| = 1} \|A^*Ax\| = a_{\max},$$

where $a_{\max}$ is the largest eigenvalue of $A^*A$. Combining this with (51), we conclude
that $\|A\|^2 \le a_{\max}$. To show that equality holds, we note that for the eigenvector $f$ of
$A^*A$, $A^*Af = a_{\max} f$, and so in the Schwarz inequality which gave (51), the sign of
equality holds. □

Exercise 12. (i) What is the norm of the matrix

[matrix entries lost in the source]

in the standard Euclidean structure?

(ii) Compare the value of $\|A\|$ with the upper and lower bounds of $\|A\|$ asked
for in Exercise 19 of Chapter 7.

Exercise 13. What is the norm of the matrix

$$\begin{pmatrix} 1 & 0 & -1 \\ 2 & 3 & 0 \end{pmatrix}$$

in the standard Euclidean structures of $\mathbb{R}^2$ and $\mathbb{R}^3$?
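Theorem 13 lends itself to a quick numerical check in the real $2 \times 2$ case. The sketch below is illustrative only: the sample matrix and the function names are assumptions, and the brute-force sampling of the unit circle is used merely as an independent comparison with the eigenvalue formula.

```python
import math

def norm_via_eigenvalue(A):
    """Square root of the largest eigenvalue of A^T A for a real 2x2 matrix A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    largest = (p + r) / 2 + math.hypot((p - r) / 2, q)
    return math.sqrt(largest)

def norm_by_sampling(A, samples=100000):
    """Approximate max ||Ax|| over unit vectors x by sampling the unit circle."""
    (a, b), (c, d) = A
    best = 0.0
    for k in range(samples):
        t = math.pi * k / samples      # a half circle suffices since ||A(-x)|| = ||Ax||
        x, y = math.cos(t), math.sin(t)
        best = max(best, math.hypot(a * x + b * y, c * x + d * y))
    return best

A = [[3.0, 0.0], [4.0, 5.0]]           # sample matrix (an assumption for illustration)
exact = norm_via_eigenvalue(A)         # here A^T A = [[25, 20], [20, 25]]
sampled = norm_by_sampling(A)
```

For the sample matrix the eigenvalues of $A^T A$ are 5 and 45, so the norm is $\sqrt{45}$; the sampled maximum never exceeds this value and approaches it as the sampling is refined.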




CHAPTER 9 


Calculus of Vector- and 
Matrix-Valued Functions 


In Section 1 of this chapter we develop the calculus of vector- and matrix-valued
functions. There are two ways of going about it: by representing vectors and
matrices in terms of their components and entries with respect to some basis and
using the calculus of number-valued functions or by redoing the theory in the context
of linear spaces. Here we opt for the second approach, because of its simplicity and
because it is the conceptual way to think about the subject; but we reserve the right to
go to components when necessary.

In what follows, the field of scalars is the real or complex numbers. In Chapter 7
we defined the length of vectors and the norm of matrices; see (1) and (32). This
made it possible to define convergence of sequences as follows.

(i) A sequence $x_k$ of vectors in $\mathbb{R}^n$ converges to the vector $x$ if

$$\lim_{k \to \infty} \|x_k - x\| = 0.$$

(ii) A sequence $A_k$ of $n \times n$ matrices converges to $A$ if

$$\lim_{k \to \infty} \|A_k - A\| = 0.$$


We could have defined convergence of sequences of vectors and matrices,
without introducing the notion of size, by requiring that each component of $x_k$ tend
to the corresponding component of $x$ and, in the case of matrices, that each entry of
$A_k$ tend to the corresponding entry of $A$. But using the notion of size introduces a
simplification in notation and thinking, and is an aid in proof. There is more about
size in Chapters 14 and 15.

1. THE CALCULUS OF VECTOR- AND MATRIX-VALUED
FUNCTIONS

Let $x(t)$ be a vector-valued function of the real variable $t$, defined, say, for $t$ in $(0, 1)$.
We say that $x(t)$ is continuous at $t_0$ if

$$\lim_{t \to t_0} \|x(t) - x(t_0)\| = 0. \tag{1}$$

We say that $x$ is differentiable at $t_0$, with derivative $\dot x(t_0)$, if

$$\lim_{h \to 0} \left\| \frac{x(t_0 + h) - x(t_0)}{h} - \dot x(t_0) \right\| = 0.$$




The notion of continuity and differentiability of matrix-valued functions is defined
similarly.

The fundamental lemma of differentiation holds for vector- and matrix-valued
functions.

Theorem 1. If $\dot x(t) = 0$ for all $t$ in $(0, 1)$, then $x(t)$ is constant.

Exercise 1. Prove the fundamental lemma for vector-valued functions. (Hint:
Show that for every vector $y$, $(x(t), y)$ is constant.)

We turn to the rules of differentiation.

Linearity. (i) The sum of two differentiable functions is differentiable, and

$$\frac{d}{dt}(x + y) = \frac{d}{dt}x + \frac{d}{dt}y.$$

(ii) The constant multiple of a differentiable function is differentiable, and

$$\frac{d}{dt}\big(k x(t)\big) = k \frac{d}{dt}x(t).$$

Similarly for matrix-valued differentiable functions,

(iii) $$\frac{d}{dt}\big(A(t) + B(t)\big) = \frac{d}{dt}A(t) + \frac{d}{dt}B(t).$$

(iv) If $A$ is independent of $t$, then we have

$$\frac{d}{dt} A x(t) = A \frac{d}{dt} x(t).$$

The proof is the same as in scalar calculus.

For vector- and matrix-valued functions there is a further manifestation of the
linearity of the derivative: Suppose that $l$ is a fixed linear function defined on $\mathbb{R}^n$ and
that $x(t)$ is a differentiable vector-valued function. Then $l(x(t))$ is a differentiable
function, and

$$\frac{d}{dt} l(x(t)) = l\Big(\frac{d}{dt} x(t)\Big). \tag{2}$$

The same result applies to linear functions of matrices. In particular the trace,
defined by (35) in Chapter 5, is such a linear function. So we have, for every
differentiable matrix function $A(t)$, that

$$\frac{d}{dt} \operatorname{tr}(A(t)) = \operatorname{tr}\Big(\frac{d}{dt} A(t)\Big). \tag{2'}$$

The rule (sometimes called the Leibniz rule) for differentiating a product is the 
same as in elementary calculus. Here, however, we have at least five kinds of 
products and therefore five versions of rules. 

Product Rules 

(i) The product of a scalar function and a vector function: 

\[ \frac{d}{dt}\bigl[k(t)x(t)\bigr] = \left(\frac{dk}{dt}\right)x(t) + k(t)\,\frac{d}{dt}x(t). \]

(ii) The product of a matrix function times a vector function: 

\[ \frac{d}{dt}\bigl[A(t)x(t)\bigr] = \left(\frac{d}{dt}A(t)\right)x(t) + A(t)\,\frac{d}{dt}x(t). \]

(iii) The product of two matrix-valued functions: 

\[ \frac{d}{dt}\bigl[A(t)B(t)\bigr] = \left[\frac{d}{dt}A(t)\right]B(t) + A(t)\,\frac{d}{dt}B(t). \]

(iv) The product of a scalar-valued and a matrix-valued function: 

\[ \frac{d}{dt}\bigl[k(t)A(t)\bigr] = \left[\frac{dk}{dt}\right]A(t) + k(t)\,\frac{d}{dt}A(t). \]

(v) The scalar product of two vector functions: 

\[ \frac{d}{dt}\bigl(y(t), x(t)\bigr) = \left(\frac{d}{dt}y(t),\, x(t)\right) + \left(y(t),\, \frac{d}{dt}x(t)\right). \]

The proof of all these is the same as in the case of ordinary numerical functions. 
The rule for differentiating the inverse of a matrix function resembles the calculus 
rule for differentiating the reciprocal of a function, with one subtle twist. 

Theorem 2. Let $A(t)$ be a matrix-valued function, differentiable and invertible. 
Then $A^{-1}(t)$ also is differentiable, and 

\[ \frac{d}{dt}A^{-1} = -A^{-1}\left(\frac{d}{dt}A\right)A^{-1}. \tag{3} \]

Proof. The following identity is easily verified: 

\[ A^{-1}(t + h) - A^{-1}(t) = A^{-1}(t + h)\,\bigl[A(t) - A(t + h)\bigr]\,A^{-1}(t). \]

Dividing both sides by $h$ and letting $h \to 0$ yields (3). □ 

Exercise 2. Derive formula (3) using product rule (iii). 
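
Formula (3) can be checked numerically. The sketch below is only an illustration, not part of the text's argument: the matrix function `A(t)`, the helper names, and the step size are all assumptions chosen for the example. It compares the right-hand side of (3) with a central-difference approximation of the derivative of $A^{-1}(t)$ for a 2 x 2 matrix.

```python
# Finite-difference check of formula (3) on a hypothetical 2x2 example.

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(X):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def A(t):
    # Illustrative matrix function (an assumption, not taken from the text).
    return [[2.0 + t, t * t], [1.0, 3.0 - t]]

def A_dot(t):
    # Entrywise derivative of A(t), computed by hand.
    return [[1.0, 2.0 * t], [0.0, -1.0]]

def formula_3(t):
    """Right-hand side of (3): -A^{-1} (dA/dt) A^{-1}."""
    A_inv = inverse(A(t))
    product = matmul(matmul(A_inv, A_dot(t)), A_inv)
    return [[-entry for entry in row] for row in product]

def central_difference(t, h=1e-6):
    """Numerical derivative of A^{-1}(t) by a central difference."""
    plus, minus = inverse(A(t + h)), inverse(A(t - h))
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

t0 = 0.5
exact = formula_3(t0)
approx = central_difference(t0)
error = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
print("largest entrywise discrepancy:", error)
```

The order of the factors in `formula_3` matters: this is the "subtle twist" of Theorem 2, since $A^{-1}$ and $dA/dt$ need not commute.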


The chain rule of calculus says that if $f$ and $a$ are scalar-valued differentiable 
functions, so is their composite, $f(a(t))$, and 

\[ \frac{d}{dt}f(a(t)) = f'(a)\,\frac{da}{dt}, \tag{4} \]

where $f'$ is the derivative of $f$. We show that the chain rule fails for matrix-valued 
functions. Take $f(a) = a^2$; by the product rule, 

\[ \frac{d}{dt}A^2 = A\,\frac{d}{dt}A + \left(\frac{d}{dt}A\right)A, \]

certainly not the same as (4). More generally, writing $\dot{A}$ for $dA/dt$, we claim that for any 
positive integer power $k$, 

\[ \frac{d}{dt}A^k = \dot{A}A^{k-1} + A\dot{A}A^{k-2} + \cdots + A^{k-1}\dot{A}. \tag{5} \]

This is easily proved by induction: We write 

\[ A^k = A\,A^{k-1} \]

and apply the product rule 

\[ \frac{d}{dt}A^k = \dot{A}A^{k-1} + A\,\frac{d}{dt}A^{k-1}. \]
Theorem 3. Let $p$ be any polynomial, let $A(t)$ be a square matrix-valued 
function that is differentiable; denote the derivative of $A$ with respect to $t$ as $\dot{A}$. 

(a) If for a particular value of $t$ the matrices $A(t)$ and $\dot{A}(t)$ commute, then the 
chain rule in form (4) holds at $t$: 

\[ \frac{d}{dt}p(A) = p'(A)\dot{A}. \tag{6} \]

(b) Even if $A(t)$ and $\dot{A}(t)$ do not commute, a trace of the chain rule remains: 

\[ \frac{d}{dt}\operatorname{tr} p(A) = \operatorname{tr}\bigl(p'(A)\dot{A}\bigr). \tag{6$'$} \]

Proof. Suppose $A$ and $\dot{A}$ commute; then (5) can be rewritten as 

\[ \frac{d}{dt}A^k = kA^{k-1}\dot{A}. \]

This is formula (6) for $p(s) = s^k$; since all polynomials are linear combinations of 
powers, using the linearity of differentiation we deduce (6) for all polynomials. 

For noncommuting $A$ and $\dot{A}$ we take the trace of (5). According to Theorem 6 of 
Chapter 5, trace is commutative: 

\[ \operatorname{tr}\bigl(A^j\dot{A}A^{k-j-1}\bigr) = \operatorname{tr}\bigl(A^{k-j-1}A^j\dot{A}\bigr) = \operatorname{tr}\bigl(A^{k-1}\dot{A}\bigr). \]

So we deduce that 

\[ \operatorname{tr}\frac{d}{dt}A^k = k\operatorname{tr}\bigl(A^{k-1}\dot{A}\bigr). \]

Since trace and differentiation commute [see (2)$'$], we deduce formula (6)$'$ for 
$p(s) = s^k$. The extension to arbitrary polynomials goes as before. □ 
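
A small worked instance may make the two parts of Theorem 3 concrete. The sketch below is an illustration under stated assumptions: the matrix function $A(t) = \begin{pmatrix} t & 1 \\ t^2 & 0 \end{pmatrix}$, evaluated at $t = 1$, and all helper names are hypothetical choices. For $p(s) = s^2$ it compares the true derivative $A\dot{A} + \dot{A}A$ with the naive chain-rule expression $2A\dot{A}$, and then compares their traces.

```python
# Theorem 3 for p(s) = s^2, with integer entries so that all arithmetic is exact.

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

# Hypothetical example: A(t) = [[t, 1], [t^2, 0]] evaluated at t = 1.
A = [[1, 1], [1, 0]]
A_dot = [[1, 0], [2, 0]]  # entrywise derivative [[1, 0], [2t, 0]] at t = 1

# A and A_dot do not commute, so part (a) does not apply.
commute = matmul(A, A_dot) == matmul(A_dot, A)

# True derivative of A^2, from the product rule.
product_rule = add(matmul(A, A_dot), matmul(A_dot, A))

# What the scalar chain rule (4) would predict: p'(A) A_dot = 2 A A_dot.
naive_chain_rule = [[2 * entry for entry in row] for row in matmul(A, A_dot)]

print("commute:", commute)
print("product rule:", product_rule)
print("naive chain rule:", naive_chain_rule)
print("traces:", trace(product_rule), trace(naive_chain_rule))
```

The two matrices differ, as the discussion before (5) predicts, while their traces agree, as (6)$'$ asserts.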

We extend now the product rule to multilinear functions $M(a_1, \ldots, a_k)$. Suppose 
$x_1, \ldots, x_k$ are differentiable vector functions. Then $M(x_1, \ldots, x_k)$ is differentiable, 
and 

\[ \frac{d}{dt}M(x_1, \ldots, x_k) = M(\dot{x}_1, x_2, \ldots, x_k) + \cdots + M(x_1, \ldots, x_{k-1}, \dot{x}_k). \tag{7} \]

The proof is straightforward: since $M$ is multilinear, 

\[ \begin{aligned}
M(x_1(t+h), \ldots, x_k(t+h)) &- M(x_1(t), \ldots, x_k(t)) \\
&= M(x_1(t+h) - x_1(t),\, x_2(t+h), \ldots, x_k(t+h)) \\
&\quad + M(x_1(t),\, x_2(t+h) - x_2(t),\, x_3(t+h), \ldots, x_k(t+h)) \\
&\quad + \cdots + M(x_1(t), \ldots, x_{k-1}(t),\, x_k(t+h) - x_k(t)).
\end{aligned} \]

Dividing by $h$ and letting $h$ tend to zero gives (7). 

The most important application of (7) is to the function $D$, the determinant, 
defined in Chapter 5: 

\[ \frac{d}{dt}D(x_1, \ldots, x_n) = D(\dot{x}_1, x_2, \ldots, x_n) + \cdots + D(x_1, \ldots, x_{n-1}, \dot{x}_n). \tag{8} \]

We now show how to recast this formula to involve a matrix $X$ itself, not its 
columns. We start with the case when $X(0) = I$, that is, $x_j(0) = e_j$. In this case the 
determinants on the right in (8) are easily evaluated at $t = 0$: 

\[ \begin{aligned}
D(\dot{x}_1(0), e_2, \ldots, e_n) &= \dot{x}_{11}(0), \\
D(e_1, \dot{x}_2(0), e_3, \ldots, e_n) &= \dot{x}_{22}(0), \\
&\;\;\vdots \\
D(e_1, \ldots, e_{n-1}, \dot{x}_n(0)) &= \dot{x}_{nn}(0).
\end{aligned} \]

Setting this into (8) we deduce that if $X(t)$ is a differentiable matrix-valued function 
and $X(0) = I$, then 

\[ \frac{d}{dt}\det X(t)\Big|_{t=0} = \operatorname{tr}\dot{X}(0). \tag{8$'$} \]

Suppose $Y(t)$ is a differentiable square matrix-valued function, which is 
invertible. We define $X(t)$ as $Y(0)^{-1}Y(t)$, and write 

\[ Y(t) = Y(0)X(t); \tag{9} \]

clearly, $X(0) = I$, so formula (8)$'$ is applicable. Taking the determinant of (9), we get 
by the product rule for determinants that 

\[ \det Y(t) = \det Y(0)\,\det X(t). \tag{9$'$} \]

Setting (9) and (9)$'$ into (8)$'$, we get 

\[ \bigl[\det Y(0)\bigr]^{-1}\frac{d}{dt}\det Y(t)\Big|_{t=0} = \operatorname{tr}\bigl[Y^{-1}(0)\dot{Y}(0)\bigr]. \]

We can rewrite this as 

\[ \frac{d}{dt}\log\det Y(t)\Big|_{t=0} = \operatorname{tr}\bigl[Y^{-1}(0)\dot{Y}(0)\bigr]. \]

Since now there is nothing special about $t = 0$, this relation holds for all $t$: 

Theorem 4. Let $Y(t)$ be a differentiable square matrix-valued function. Then 
for those values of $t$ for which $Y(t)$ is invertible, 

\[ \frac{d}{dt}\log\det Y = \operatorname{tr}\!\left(Y^{-1}\frac{d}{dt}Y\right). \tag{10} \]

The importance of this result lies in the connection it establishes between 
determinant and trace. 
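
Formula (10) lends itself to a quick numerical check. The following sketch is illustrative only: the matrix function `Y(t)`, the evaluation point, and the step size are assumptions, chosen so that $\det Y(t) > 0$ near the evaluation point and the real logarithm is defined. It compares a central difference of $\log\det Y(t)$ with $\operatorname{tr}(Y^{-1}\dot{Y})$.

```python
import math

# Numerical check of formula (10) on a hypothetical 2x2 example.

def Y(t):
    # Illustrative matrix function with det Y(t) = 6 + 3t - t^3 (positive near t = 0.5).
    return [[2.0 + t, t], [t * t, 3.0]]

def Y_dot(t):
    # Entrywise derivative of Y(t), computed by hand.
    return [[1.0, 1.0], [2.0 * t, 0.0]]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inverse(X):
    d = det(X)
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def matmul(X, Z):
    return [[sum(X[i][k] * Z[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def log_det_derivative(t, h=1e-6):
    """Central-difference approximation of d/dt log det Y(t)."""
    return (math.log(det(Y(t + h))) - math.log(det(Y(t - h)))) / (2 * h)

def trace_formula(t):
    """Right-hand side of (10): tr(Y^{-1} dY/dt)."""
    return trace(matmul(inverse(Y(t)), Y_dot(t)))

t0 = 0.5
lhs = log_det_derivative(t0)
rhs = trace_formula(t0)
print("finite difference:", lhs)
print("trace formula:    ", rhs)
```

For this particular choice both sides equal $(3 - 3t^2)/(6 + 3t - t^3)$, which can be confirmed by hand.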

So far we have defined $f(A)$ for matrix arguments when $f$ is a polynomial. We 
show now an example of a nonpolynomial $f$ for which $f(A)$ can be defined. We take 
$f(s) = e^s$, defined by the Taylor series 

\[ e^s = \sum_{k=0}^{\infty} \frac{s^k}{k!}. \tag{11} \]

We claim that the Taylor series also serves to define $e^A$ for any square matrix $A$: 

\[ e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!}. \tag{11$'$} \]

The proof of convergence is the same as in the scalar case; it boils down to showing 
that the difference of the partial sums tends to zero. That is, denote by $e_m(A)$ the $m$th 
partial sum: 

\[ e_m(A) = \sum_{k=0}^{m} \frac{A^k}{k!}; \tag{12} \]

then, for $l < m$, 

\[ e_m(A) - e_l(A) = \sum_{k=l+1}^{m} \frac{A^k}{k!}. \tag{12$'$} \]

Using the multiplicative and additive inequalities for the norm of matrices developed 
in Chapter 7, Theorem 14, we deduce that 

\[ \|e_m(A) - e_l(A)\| \le \sum_{k=l+1}^{m} \frac{\|A\|^k}{k!}. \tag{13} \]

We are now back in the scalar case, and therefore can estimate the right-hand side 
and assert that as $l$ and $m$ tend to infinity, the right-hand side of (13) tends to zero, 
uniformly for all matrices whose norm $\|A\|$ is less than any preassigned constant. 

The matrix exponential function has some but not all properties of the scalar 
exponential function. 

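Theorem 5. (a) If $A$ and $B$ are commuting square matrices, 

\[ e^{A+B} = e^A e^B. \]

(b) If $A$ and $B$ do not commute, then in general 

\[ e^{A+B} \ne e^A e^B. \]

(c) If $A(t)$ depends differentiably on $t$, so does $e^{A(t)}$. 

(d) If for a particular value of $t$, $A(t)$ and $\dot{A}(t)$ commute, then $(d/dt)e^A = e^A\dot{A}$. 

(e) If $A$ is anti-self-adjoint, $A^* = -A$, then $e^A$ is unitary. 

Proof. Part (a) follows from the definition (11)$'$ of $e^{A+B}$, after $(A+B)^k$ is 
expressed as $\sum_j \binom{k}{j} A^j B^{k-j}$, valid for commuting variables. 

That commutativity is used essentially in the proof of part (a) makes part (b) 
plausible. We shall not make the statement more precise; we content ourselves with 
giving a single example: 

\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \]

It is easy to see that $A^2 = 0$, $B^2 = 0$, so by definition (11)$'$, 

\[ e^A = I + A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad e^B = I + B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}. \]

A brief calculation shows that 

\[ e^A e^B = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad e^B e^A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}; \]

since these products are different, at least one must differ from $e^{A+B}$; actually, 
both do. 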
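
The example can be reproduced by summing the series (11)$'$ directly. The sketch below is a minimal illustration, not a robust way to compute matrix exponentials: the function name `exp_series` and the cutoff of 30 terms are assumptions, adequate only for small matrices of modest norm such as these.

```python
# Partial sums of the exponential series (11)' for the 2x2 example of Theorem 5(b).

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(X, terms=30):
    """Partial sum e_m(X) of the exponential series, with m = terms."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]  # current term X^k / k!, starting at k = 0
    for k in range(1, terms + 1):
        term = [[entry / k for entry in row] for row in matmul(term, X)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
A_plus_B = [[0.0, 1.0], [1.0, 0.0]]

eA_eB = matmul(exp_series(A), exp_series(B))
eB_eA = matmul(exp_series(B), exp_series(A))
e_sum = exp_series(A_plus_B)

print("e^A e^B =", eA_eB)
print("e^B e^A =", eB_eA)
print("e^(A+B) =", e_sum)
```

Because $A^2 = B^2 = 0$, the series for $e^A$ and $e^B$ terminate after two terms, so the two products are computed exactly; the series for $e^{A+B}$ does not terminate and is only approximated.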




Exercise 3. Calculate 

\[ e^{A+B} = \exp\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]

To prove (c) we rely on the following matrix analogue of an important property of 
differentiation: Let $\{E_m(t)\}$ be a sequence of differentiable matrix-valued functions 
defined on an interval, with these properties: 

(i) $E_m(t)$ converges uniformly to a limit function $E(t)$. 

(ii) The derivatives $\dot{E}_m(t)$ converge uniformly to a limit function $F(t)$. 

Conclusion: $E$ is differentiable, and $\dot{E} = F$. 

Exercise 4. Prove the proposition stated in the Conclusion. 

We apply the same principle to $E_m(t) = e_m(A(t))$. We have already shown that 
$E_m(t)$ tends uniformly to $e^{A(t)}$; a similar argument shows that $\dot{E}_m(t)$ converges. 

Exercise 5. Carry out the details of the argument that $\dot{E}_m(t)$ converges. 

Part (d) of Theorem 5 follows from the explicit formula for $(d/dt)e^{A(t)}$, obtained 
by differentiating the series (11)$'$ termwise. 

To prove part (e) we start with the definition (11)$'$ of $e^A$. Since forming the adjoint 
is a linear and continuous operation, we can take the adjoint of the infinite series in 
(11)$'$ term by term: 

\[ \bigl(e^A\bigr)^* = \left(\sum_{k=0}^{\infty}\frac{A^k}{k!}\right)^{\!*} = \sum_{k=0}^{\infty}\frac{(A^*)^k}{k!} = e^{A^*} = e^{-A}. \]

It follows, using part (a), that 

\[ \bigl(e^A\bigr)^* e^A = e^{-A}e^A = e^0 = I. \]

According to formula (45) of Chapter 7, this shows that $e^A$ is unitary. □ 

Exercise 6. Apply formula (10) to $Y(t) = e^{At}$ and show that 

\[ \det e^A = e^{\operatorname{tr} A}. \]

Exercise 7. Prove that all eigenvalues of $e^A$ are of the form $e^a$, $a$ an eigenvalue 
of $A$. Hint: Use Theorem 4 of Chapter 6, along with Theorem 6 below. 

We remind the reader that for self-adjoint matrices $H$ we have already in 
Chapter 8 defined $f(H)$ for a broad class of functions; see formula (33)$'$. 


2. SIMPLE EIGENVALUES OF A MATRIX 


In this section we shall study the manner in which the eigenvalues of a matrix 
depend on the matrix. We take the field of scalars to be $\mathbb{C}$. 

Theorem 6. The eigenvalues depend continuously on the matrix in the 
following sense: If $\{A_m\}$ is a convergent sequence of square matrices, in the sense 
that all entries of $A_m$ converge to the corresponding entry of $A$, then the set of 
eigenvalues of $A_m$ converges to the set of eigenvalues of $A$. That is, for every $\epsilon > 0$ 
there is a $k$ such that all eigenvalues of $A_m$ are, for $m > k$, contained in discs of radius 
$\epsilon$ centered at the eigenvalues of $A$. 

Proof. The eigenvalues of $A_m$ are the roots of the characteristic polynomial 
$p_m(s) = \det(sI - A_m)$. Since $A_m$ tends to $A$, all entries of $A_m$ tend to the 
corresponding entries of $A$; from this it follows that the coefficients of $p_m$ tend to 
the coefficients of $p$. Since the roots of polynomials depend continuously on the 
coefficients, Theorem 6 follows. □ 


Next we investigate the differentiability of the dependence of the eigenvalues on 
the matrix. There are several ways of formulating such a result, for example, in the 
following theorem. 

Theorem 7. Let $A(t)$ be a differentiable square matrix-valued function of the 
real variable $t$. Suppose that $A(0)$ has an eigenvalue $a_0$ of multiplicity one, in the 
sense that $a_0$ is a simple root of the characteristic polynomial of $A(0)$. Then for $t$ 
small enough, $A(t)$ has an eigenvalue $a(t)$ that depends differentiably on $t$, and which 
equals $a_0$ at zero, that is, $a(0) = a_0$. 

Proof. The characteristic polynomial of $A(t)$ is 

\[ \det(sI - A(t)) = p(s, t), \]

a polynomial of degree $n$ in $s$ whose coefficients are differentiable functions of $t$. The 
assumption that $a_0$ is a simple root of the characteristic polynomial of $A(0)$ means that 

\[ p(a_0, 0) = 0, \qquad \frac{\partial p}{\partial s}(s, 0)\Big|_{s = a_0} \ne 0. \]

According to the implicit function theorem, under these conditions the equation 
$p(s, t) = 0$ has a solution $s = a(t)$ in a neighborhood of $t = 0$ that depends 
differentiably on $t$. □ 

Next we show that under the same conditions as in Theorem 7, the eigenvector 
pertaining to the eigenvalue $a(t)$ can be chosen to depend differentiably on $t$. We say 
"can be chosen" because an eigenvector is determined only up to a scalar factor; by 
inserting a scalar factor $k(t)$ that is a nondifferentiable function of $t$ we could, with 
malice aforethought, spoil differentiability (and even continuity). 

Theorem 8. Let $A(t)$ be a differentiable matrix-valued function of $t$, $a(t)$ an 
eigenvalue of $A(t)$ of multiplicity one. Then we can choose an eigenvector $h(t)$ of 
$A(t)$ pertaining to the eigenvalue $a(t)$ to depend differentiably on $t$. 

Proof. We need the following lemma. 

Lemma 9. Let $A$ be an $n \times n$ matrix, $p$ its characteristic polynomial, $a$ some 
simple root of $p$. Then at least one of the $(n - 1) \times (n - 1)$ principal minors of 
$A - aI$ has nonzero determinant, where the $i$th principal minor is the matrix 
remaining when the $i$th row and $i$th column of $A$ are removed. 

Proof. We may, at the cost of subtracting $aI$ from $A$, take the eigenvalue to be 
zero. The condition that 0 is a simple root of $p(s)$ means that $p(0) = 0$; 
$(dp/ds)(0) \ne 0$. To compute the derivative of $p$ we denote by $c_1, \ldots, c_n$ the 
columns of $A$, and by $e_1, \ldots, e_n$ the unit vectors. Then 

\[ sI - A = (se_1 - c_1,\; se_2 - c_2,\; \ldots,\; se_n - c_n). \]

Now we use formula (8) for the derivative of a determinant: 

\[ \begin{aligned}
\frac{dp}{ds}(0) &= \frac{d}{ds}\det(sI - A)\Big|_{s=0} \\
&= \det(e_1, -c_2, \ldots, -c_n) + \cdots + \det(-c_1, -c_2, \ldots, -c_{n-1}, e_n).
\end{aligned} \]

Using Lemma 2 of Chapter 5 for the determinants on the right-hand side we see that 
$(dp/ds)(0)$ is $(-1)^{n-1}$ times the sum of the determinants of the $(n - 1) \times (n - 1)$ 
principal minors. Since $(dp/ds)(0) \ne 0$, at least one of the determinants of these 
principal minors is nonzero. □ 

Let $A$ be a matrix as in Lemma 9 and take the eigenvalue $a$ to be zero. Then one of 
the principal $(n - 1) \times (n - 1)$ minors of $A$, say the $i$th, has nonzero determinant. 
We claim that the $i$th component of an eigenvector $h$ of $A$ pertaining to the 
eigenvalue $a$ is nonzero. Suppose it were zero; denote by $h^{(i)}$ the vector obtained from $h$ by 
omitting the $i$th component, and by $A_{ii}$ the $i$th principal minor of $A$. Then $h^{(i)}$ 
satisfies 

\[ A_{ii}h^{(i)} = 0. \tag{14} \]

Since $A_{ii}$ has determinant not equal to 0, $A_{ii}$ is, according to Theorem 5 of Chapter 5, 
invertible. But then according to (14), $h^{(i)} = 0$. Since the $i$th component was assumed to be zero, that 
would make $h = 0$, a contradiction, since an eigenvector is not equal to 0. Having 
shown that the $i$th component of $h$ is not equal to 0, we set it equal to 1 as a way of 
normalizing $h$. For the remaining components we have now an inhomogeneous 
system of equations: 

\[ A_{ii}h^{(i)} = c^{(i)}, \]

where $c^{(i)}$ is $-1$ times the $i$th column of $A$, with the $i$th component removed. So 

\[ h^{(i)} = A_{ii}^{-1}c^{(i)}. \tag{15} \]

The matrix $A(0)$ and the eigenvalue $a(0)$ of Theorem 8 satisfy the hypothesis of 
Lemma 9. Then one of the principal minors $A_{ii}(0) - a(0)I$ is invertible; since $A(t)$ depends continuously on $t$, it 
follows from Theorem 6 that $A_{ii}(t) - a(t)I$ is invertible for $t$ small; for such small 
values of $t$ we set the $i$th component of $h(t)$ equal to 1, and determine the rest of $h$ by 
formula (15): 

\[ h^{(i)}(t) = \bigl(A_{ii}(t) - a(t)I\bigr)^{-1}c^{(i)}(t). \tag{16} \]

Since all terms on the right depend differentiably on $t$, so does $h^{(i)}(t)$. This concludes 
the proof of Theorem 8. □ 

We now extend Lemma 9 to the case when the characteristic polynomial has 
multiple roots and prove the following results. 

Lemma 10. Let $A$ be an $n \times n$ matrix, $p$ its characteristic polynomial. Let $a$ be 
some root of $p$ of multiplicity $k$. Then the nullspace of $(A - aI)$ is at most $k$-dimensional. 

Proof. We may, without loss of generality, take $a = 0$. That 0 is a root of 
multiplicity $k$ means that 

\[ p(0) = \cdots = \frac{d^{k-1}}{ds^{k-1}}p(0) = 0, \qquad \frac{d^k}{ds^k}p(0) \ne 0. \]

Proceeding as in the proof of Lemma 9, that is, differentiating $k$ times $\det(sI - A)$, 
we can express the $k$th derivative of $p$ at 0 as a sum of determinants of principal 
minors of order $(n - k) \times (n - k)$. Since the $k$th derivative is not equal to 0, it 
follows that at least one of these determinants is nonzero, say the minor obtained by 
removing from $A$ the $i$th rows and columns, $i = 1, \ldots, k$. Denote this minor as $A^{(k)}$. 

We claim that the nullspace $N$ of $A$ contains no vector other than zero whose first $k$ 
components are all zero. For, suppose $h$ is such a vector; denote by $h^{(k)}$ the vector 
obtained from $h$ by removing the first $k$ components. Since $Ah = 0$, this shortened 
vector satisfies the equation 

\[ A^{(k)}h^{(k)} = 0. \tag{17} \]

Since $\det A^{(k)} \ne 0$, $A^{(k)}$ is invertible; therefore it follows from (17) that $h^{(k)} = 0$. 
Since the components that were removed are zero, it follows that $h = 0$, a 
contradiction. 

It follows now that $\dim N \le k$; for, if the dimension of $N$ were greater than $k$, it 
would follow from Corollary A of Theorem 1 in Chapter 3 that the $k$ linear 
conditions $h_1 = 0, \ldots, h_k = 0$ are satisfied by some nonzero vector $h$ in $N$. Having 
just shown that no nonzero vector $h$ in $N$ satisfies these conditions, we conclude that 
$\dim N \le k$. □ 


Lemma 10 can be used to prove Theorem 11, announced in Chapter 6. 

Theorem 11. Let $A$ be an $n \times n$ matrix, $p$ its characteristic polynomial, $a$ some 
root of $p$ of multiplicity $k$. The dimension of the space of generalized eigenvectors of 
$A$ pertaining to the eigenvalue $a$ is $k$. 

Proof. We saw in Chapter 6 that the space of generalized eigenvectors is the 
nullspace of $(A - aI)^d$, where $d$ is the index of the eigenvalue $a$. We take $a = 0$. The 
characteristic polynomial $p_d$ of $A^d$ can be expressed in terms of the characteristic 
polynomial $p$ of $A$ as follows: 

\[ sI - A^d = \prod_{j=0}^{d-1}\bigl(s^{1/d}I - \omega^j A\bigr), \]

where $\omega$ is a primitive $d$th root of unity. Taking determinants and using the 
multiplicative property of determinants we get 

\[ \begin{aligned}
p_d(s) = \det(sI - A^d) &= \prod_{j=0}^{d-1}\det\bigl(s^{1/d}I - \omega^j A\bigr) \\
&= \pm\prod_{j=0}^{d-1}\det\bigl(\omega^{-j}s^{1/d}I - A\bigr) = \pm\prod_{j=0}^{d-1}p\bigl(\omega^{-j}s^{1/d}\bigr).
\end{aligned} \]

Since $a = 0$ is a root of $p$ of multiplicity $k$, it follows that 

\[ p(s) \sim \text{const.}\; s^k \tag{18} \]

as $s$ tends to zero. It follows from (18) that as $s$ tends to zero, 

\[ p_d(s) \sim \text{const.}\; s^k; \]

therefore $p_d$ also has a root of multiplicity $k$ at 0. It follows then from Lemma 10 that 
the nullspace of $A^d$ is at most $k$-dimensional. 

To show that equality holds, we argue as follows. Denote the roots of $p$ as 
$a_1, \ldots, a_j$ and their multiplicities as $k_1, \ldots, k_j$. Since $p$ is a polynomial of degree $n$, 
according to the fundamental theorem of algebra, 

\[ \sum_i k_i = n. \tag{19} \]

Denote by $N_i$ the space of generalized eigenvectors of $A$ pertaining to the eigenvalue 
$a_i$. According to Theorem 7, the spectral theorem, of Chapter 6, every vector can be 
decomposed as a sum of generalized eigenvectors: $\mathbb{C}^n = N_1 \oplus \cdots \oplus N_j$. It follows 
that 

\[ n = \sum_i \dim N_i. \tag{20} \]

$N_i$ is the nullspace of $(A - a_iI)^{d_i}$; we have already shown that 

\[ \dim N_i \le k_i. \tag{21} \]

Setting this into (20), we obtain 

\[ n \le \sum_i k_i. \]

Comparing this with (19), we conclude that in all inequalities (21) the sign of 
equality holds. □ 

We show next how to actually calculate the derivative of the eigenvalue $a(t)$ and 
the eigenvector $h(t)$ of a matrix function $A(t)$ when $a(t)$ is a simple root of the 
characteristic polynomial of $A(t)$. We start with the eigenvector equation 

\[ Ah = ah. \tag{22} \]

We have seen in Chapter 5 that the transpose $A^T$ of a matrix $A$ has the same 
determinant as $A$. It follows that $A$ and $A^T$ have the same characteristic polynomial. 
Therefore if $a$ is an eigenvalue of $A$, it is also an eigenvalue of $A^T$: 

\[ A^Tl = al. \tag{22$'$} \]

Since $a$ is a simple root of the characteristic polynomial of $A^T$, by Theorem 11 the 
space of eigenvectors satisfying (22)$'$ is one dimensional, and there are no 
generalized eigenvectors. 

Now differentiate (22) with respect to $t$: 

\[ \dot{A}h + A\dot{h} = \dot{a}h + a\dot{h}. \tag{23} \]

Let $l$ act on (23): 

\[ (l, \dot{A}h) + (l, A\dot{h}) = \dot{a}(l, h) + a(l, \dot{h}). \tag{23$'$} \]

We use now the definition of the transpose, equation (9) of Chapter 3, to rewrite the 
second term on the left as $(A^Tl, \dot{h})$. Using equation (22)$'$, we can rewrite this as 
$a(l, \dot{h})$, the same as the second term on the right; after cancellation we are left with 

\[ (l, \dot{A}h) = \dot{a}(l, h). \tag{24} \]

We claim that $(l, h) \ne 0$, so that (24) can be used to determine $\dot{a}$. Suppose on the 
contrary that $(l, h) = 0$; we claim that then the equation 

\[ (A^T - aI)m = l \tag{25} \]

would have a solution $m$. To see this we appeal to Theorem 2$'$ of Chapter 3, 
according to which the range of $T = A^T - aI$ consists of those vectors which are 
annihilated by the vectors in the nullspace of $T^T = A - aI$. These are the 
eigenvectors of $A$ and are multiples of $h$. Therefore if $(l, h) = 0$, $l$ would satisfy 
the criterion of belonging to the range of $A^T - aI$, and equation (25) would have a 
solution $m$. This $m$ would be a generalized eigenvector of $A^T$, contrary to the fact 
that there aren't any. 

Having determined $\dot{a}$ from equation (24), we determine $\dot{h}$ from equation (23), 
which we rearrange as 

\[ (A - aI)\dot{h} = (\dot{a} - \dot{A})h. \tag{26} \]

Appealing once more to Theorem 2$'$ of Chapter 3 we note that (26) has a solution $\dot{h}$ if 
the right-hand side is annihilated by the nullspace of $A^T - aI$. That nullspace 
consists of multiples of $l$, and equation (24) is precisely the requirement that it 
annihilate the right-hand side of (26). Note that equation (26) does not determine $\dot{h}$ 
uniquely, only up to a multiple of $h$. That is as it should be, since the eigenvectors 
$h(t)$ are determined only up to a scalar factor that can be taken as an arbitrary 
differentiable function of $t$. 
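
Formula (24) gives $\dot{a} = (l, \dot{A}h)/(l, h)$, and this can be tried out on a small non-symmetric example. Everything in the sketch below is a hypothetical illustration: the matrix function $A(t) = \begin{pmatrix} 1+t & 2 \\ t & 3 \end{pmatrix}$, the hand-computed eigenvectors at $t = 0$ for the eigenvalue 3, and the helper names.

```python
import math

# Derivative of a simple eigenvalue via formula (24), on a hypothetical 2x2 example.

def dot(u, v):
    """Bilinear scalar product (l, h) of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(X, v):
    return [dot(row, v) for row in X]

def A(t):
    # Illustrative matrix function; A(0) has the simple eigenvalues 1 and 3.
    return [[1.0 + t, 2.0], [t, 3.0]]

A_dot = [[1.0, 0.0], [1.0, 0.0]]  # entrywise derivative of A(t)

h = [1.0, 1.0]  # right eigenvector: A(0) h = 3 h
l = [0.0, 1.0]  # left eigenvector:  A(0)^T l = 3 l

# Formula (24) solved for the derivative of the eigenvalue at t = 0.
a_dot_formula = dot(l, matvec(A_dot, h)) / dot(l, h)

def larger_eigenvalue(t):
    """Larger root of the characteristic polynomial of the 2x2 matrix A(t)."""
    (p, q), (r, s) = A(t)
    tr, det = p + s, p * s - q * r
    return (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0

step = 1e-6
a_dot_numeric = (larger_eigenvalue(step) - larger_eigenvalue(-step)) / (2 * step)

print("formula (24):     ", a_dot_formula)
print("finite difference:", a_dot_numeric)
```

For this choice the larger eigenvalue is $a(t) = 3 + t$, so both numbers should be close to 1. Note that $l$ and $h$ are different vectors here; for a non-symmetric matrix the left eigenvector cannot be replaced by $h$.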


3. MULTIPLE EIGENVALUES 


We are now ready to treat multiple eigenvalues. The occurrence of generalized 
eigenvectors is hard to avoid for general matrices and even harder to analyze. For 
this reason we shall discuss only self-adjoint matrices, because they have no 
generalized eigenvectors. Even in the self-adjoint case we need additional 
assumptions to be able to conclude that the eigenvectors of $A$ depend continuously 
on a parameter $t$ when $A(t)$ is a differentiable function of $t$. Here is a simple $2 \times 2$ 
example: 

\[ A = \begin{pmatrix} b & c \\ c & d \end{pmatrix}, \]

$b$, $c$, $d$ functions of $t$, so that $c(0) = 0$, $b(0) = d(0) = 1$. That makes $A(0) = I$, 
which has 1 as double eigenvalue. 

The eigenvalues $a$ of $A$ are the roots of its characteristic polynomial, 

\[ a = \frac{b + d \pm \sqrt{(b - d)^2 + 4c^2}}{2}. \]

Denote the eigenvector $h$ as $\binom{x}{y}$. The first component of the eigenvalue equation 
$Ah = ah$ is $bx + cy = ax$, from which 

\[ \frac{y}{x} = \frac{a - b}{c}. \]

Using the abbreviation $(d - b)/c = k$, we can express 

\[ \frac{y}{x} = \frac{a - b}{c} = \frac{k + \sqrt{k^2 + 4}}{2}. \]

We choose $k(t) = \sin(t^{-1})$, $c(t) = \exp(-|t|^{-1})$, and set $b \equiv 1$, $d = 1 + ck$. Clearly 
the entries of $A(t)$ are $C^\infty$ functions, yet $y/x$ is discontinuous as $t \to 0$. 
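
The discontinuity can be seen by evaluating $y/x$ along two sequences of $t$ tending to zero. The sketch below is an illustration under stated assumptions: it uses only the closed form $y/x = (k + \sqrt{k^2 + 4})/2$ with $k = \sin(1/t)$ from the text (working with $c(t)$ itself would underflow for small $t$), and the two sequences and the function names are hypothetical choices.

```python
import math

# Eigenvector slope y/x in the 2x2 example, along two sequences t -> 0.

def slope(k):
    """y/x as a function of k = (d - b)/c, for the '+' root of the eigenvalue formula."""
    return (k + math.sqrt(k * k + 4.0)) / 2.0

def slope_at(t):
    """y/x at parameter value t != 0, with k(t) = sin(1/t)."""
    return slope(math.sin(1.0 / t))

# Along the first sequence sin(1/t) = 1, along the second sin(1/t) = -1.
t_plus = [1.0 / (math.pi / 2 + 2 * math.pi * n) for n in range(1, 6)]
t_minus = [1.0 / (3 * math.pi / 2 + 2 * math.pi * n) for n in range(1, 6)]

slopes_plus = [slope_at(t) for t in t_plus]
slopes_minus = [slope_at(t) for t in t_minus]

print("t -> 0 with sin(1/t) = +1:", slopes_plus)
print("t -> 0 with sin(1/t) = -1:", slopes_minus)
```

Along the first sequence the slope stays at $(1 + \sqrt{5})/2$, along the second at $(-1 + \sqrt{5})/2$, so $y/x$ has no limit as $t \to 0$ even though every entry of $A(t)$ is smooth.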

Theorem 12 describes an additional condition under which the eigenvectors vary 
continuously. To arrive at these conditions we shall reverse the procedure employed 
for matrices with simple eigenvalues: we shall first compute the derivatives of 
eigenvalues and eigenvectors and prove afterwards that they are differentiable under 
the additional condition. 

Let $A(t)$ be a differentiable function of the real variable $t$, whose values are 
self-adjoint matrices, $A^* = A$. Suppose that at $t = 0$, $A(0)$ has $a_0$ as eigenvalue of 
multiplicity $k > 1$, that is, $a_0$ is a $k$-fold root of the characteristic equation of $A(0)$. 
According to Theorem 11, the dimension of the generalized eigenspace of $A(0)$ 
pertaining to the eigenvalue $a_0$ is $k$. Since $A(0)$ is self-adjoint, it has no generalized 
eigenvectors; so the eigenvectors $A(0)h = a_0h$ form a $k$-dimensional space which 
we denote as $N$. 

We take now eigenvectors $h(t)$ and eigenvalues $a(t)$ of $A(t)$, $a(0) = a_0$, presumed 
to depend differentiably on $t$. Then the derivatives of $h$ and $a$ satisfy equation (23); 
set $t = 0$: 

\[ \dot{A}h + A\dot{h} = \dot{a}h + a\dot{h}. \tag{27} \]

We recall now from Chapter 8 the projection operators entering the spectral 
resolution; see equations (29), (30), (31), and (32). We denote by $P$ the orthogonal 
projection onto the eigenspace $N$ of $A$ with eigenvalue $a = a_0$. Since the 
eigenvectors of $A$ are orthogonal, it follows [see equations (29)-(32)] that 

\[ PA = aP. \tag{28} \]

Furthermore, eigenvectors $h$ in $N$ satisfy 

\[ Ph = h. \tag{28$'$} \]

Now apply $P$ to both sides of (27): 

\[ P\dot{A}h + PA\dot{h} = \dot{a}Ph + aP\dot{h}. \]

Using (28) and (28)$'$, we get 

\[ P\dot{A}Ph + aP\dot{h} = \dot{a}h + aP\dot{h}. \]

The second terms on the right- and left-hand sides are equal, so after cancellation 
we get 

\[ P\dot{A}Ph = \dot{a}h. \tag{29} \]

Since $A(t)$ is self-adjoint, so is $\dot{A}$; and since $P$ is self-adjoint, so is $P\dot{A}P$. Clearly, $P\dot{A}P$ 
maps $N$ into itself; equation (29) says that $\dot{a}(0)$ must be one of the eigenvalues of 
$P\dot{A}P$ on $N$, and $h(0)$ must be an eigenvector. 

Theorem 12. Let $A(t)$ be a differentiable function of the real variable $t$ whose 
values are self-adjoint matrices. Suppose that at $t = 0$, $A(0)$ has an eigenvalue $a_0$ of 
multiplicity $k > 1$. Denote by $N$ the eigenspace of $A(0)$ with eigenvalue $a_0$ and by $P$ 
the orthogonal projection onto $N$. Assume that the self-adjoint mapping $P\dot{A}(0)P$ of 
$N$ into $N$ has $k$ distinct eigenvalues $d_i$, $i = 1, \ldots, k$. Denote by $w_i$ corresponding 
normalized eigenvectors. Then for $t$ small enough, $A(t)$ has $k$ eigenvalues 
$a_j(t)$, $j = 1, \ldots, k$, near $a_0$, with the following properties: 

(i) $a_j(t)$ depend differentiably on $t$ and tend to $a_0$ as $t \to 0$. 

(ii) For $t \ne 0$, the $a_j(t)$ are distinct. 

(iii) The corresponding eigenvector $h_j(t)$: 

\[ A(t)h_j(t) = a_j(t)h_j(t), \tag{30} \]

can be so normalized that $h_j(t)$ tends to $w_j$ as $t \to 0$. 

Proof. For $t$ small enough the characteristic polynomial of $A(t)$ differs little from 
that of $A(0)$. By hypothesis, the latter has a $k$-fold root at $a_0$; it follows that the 
former has exactly $k$ roots that approach $a_0$ as $t \to 0$. These roots are the 
eigenvalues $a_j(t)$ of $A(t)$. According to Theorem 4 of Chapter 8, the corresponding 
eigenvectors $h_j(t)$ can be chosen to form an orthonormal set. 

Lemma 13. As $t \to 0$, the distance of each of the normalized eigenvectors $h_j(t)$ 
from the eigenspace $N$ tends to zero. 

Proof. Using the orthogonal projection $P$ onto $N$, we can reformulate the 
conclusion as follows: 

\[ \lim_{t \to 0}\|(I - P)h_j(t)\| = 0, \qquad j = 1, \ldots, k. \tag{31} \]

To show this, we use the fact that as $t \to 0$, $A(t) \to A(0)$ and $a_j(t) \to a_0$; since 
$\|h_j(t)\| = 1$, we deduce from equation (30) that 

\[ A(0)h_j(t) = a_0h_j(t) + \epsilon(t), \tag{32} \]

where $\epsilon(t)$ denotes a vector function that tends to zero as $t \to 0$. Since $N$ consists of 
eigenvectors of $A(0)$, and $P$ projects any vector onto $N$, 

\[ A(0)Ph_j(t) = a_0Ph_j(t). \tag{32$'$} \]

We subtract (32)$'$ from (32) and get 

\[ A(0)(I - P)h_j(t) = a_0(I - P)h_j(t) + \epsilon(t). \tag{33} \]

Now suppose (31) were false; then there would be a positive number $d$ and a 
sequence of $t \to 0$ such that $\|(I - P)h_j(t)\| > d$. We have shown in Chapter 7 that 
there is a subsequence of $t$ for which $(I - P)h_j(t)$ tends to a limit $h$; this limit has 
norm $\ge d$. It follows from (33) that this limit satisfies 

\[ A(0)h = a_0h. \tag{33$'$} \]

This shows that $h$ belongs to the eigenspace $N$. 

On the other hand, each of the vectors $(I - P)h_j(t)$ is orthogonal to $N$; therefore so is 
their limit $h$. But since $N$ contains $h$, and $h$ is not zero, we have arrived at a contradiction. Therefore 
(31) is true. □ 

We proceed now to prove the continuity of $h_j(t)$ and the differentiability of $a_j(t)$. 
Subtract (32)$'$ from (30) and divide by $t$; after the usual Leibniz-ish rearrangement 
we get 

\[ \frac{A(t) - A(0)}{t}\,h(t) + A(0)\,\frac{h(t) - Ph(t)}{t} = \frac{a(t) - a(0)}{t}\,h(t) + a(0)\,\frac{h(t) - Ph(t)}{t}. \]

We have dropped the subscript $j$ to avoid clutter. We apply $P$ to both sides; according 
to relation (28), $PA(0) = a_0P$. Since $P^2 = P$ we see that the second terms on the two 
sides are zero. So we get 

\[ P\,\frac{A(t) - A(0)}{t}\,h(t) = \frac{a(t) - a(0)}{t}\,Ph(t). \tag{34} \]

Since $A$ was assumed to be differentiable, 

\[ \frac{A(t) - A(0)}{t} = \dot{A}(0) + \epsilon(t); \]

and by (31), $h(t) = Ph(t) + \epsilon(t)$. Setting these into (34) we get, using $P^2 = P$, that 

\[ P\dot{A}(0)P\,Ph(t) = \frac{a(t) - a(0)}{t}\,Ph(t) + \epsilon(t). \tag{35} \]

By assumption, the self-adjoint mapping $P\dot{A}(0)P$ has $k$ distinct eigenvalues $d_i$ on $N$, 
with corresponding eigenvectors $w_i$: 

\[ P\dot{A}(0)Pw_i = d_iw_i, \qquad i = 1, \ldots, k. \]

We expand $Ph(t)$ in terms of these eigenvectors: 

\[ Ph(t) = \sum_i x_iw_i, \tag{36} \]

where $x_i$ are functions of $t$, and set it into (35): 

\[ \sum_i x_i\left(d_i - \frac{a(t) - a(0)}{t}\right)w_i = \epsilon(t). \tag{35$'$} \]

Since the $\{w_i\}$ form an orthonormal basis for $N$, we can express the norm of the left-hand 
side of (36) in terms of components: 

\[ \|Ph(t)\|^2 = \sum_i |x_i|^2. \]

According to (31), $\|Ph(t) - h(t)\|$ tends to zero. Since $\|h(t)\|^2 = 1$, we deduce that 

\[ \|Ph(t)\|^2 = \sum_i |x_i(t)|^2 = 1 - \epsilon(t), \tag{37} \]

where $\epsilon(t)$ denotes a scalar function that tends to zero. We deduce from (35)$'$ that 

\[ \sum_i \left|d_i - \frac{a(t) - a(0)}{t}\right|^2 |x_i(t)|^2 = \epsilon(t). \tag{37$'$} \]

Combining (37) and (37)$'$ we deduce that for each $t$ small enough there is an index $j$ 
such that 

\[ \begin{aligned}
&\text{(i)} \quad \left|d_j - \frac{a(t) - a(0)}{t}\right| \le \epsilon(t), \\
&\text{(ii)} \quad |x_i(t)| \le \epsilon(t) \quad \text{for } i \ne j, \\
&\text{(iii)} \quad |x_j(t)| = 1 - \epsilon(t).
\end{aligned} \tag{38} \]

Since $(a(t) - a(0))/t$ are continuous functions of $t$ for $t \ne 0$, it follows from (38) that the index 
$j$ is independent of $t$ for $t$ small enough. 

The normalization $\|h(t)\| = 1$ of the eigenvectors still leaves open a factor of 
absolute value 1; we choose this factor so that not only $|x_j|$ but $x_j$ itself is near 1: 

\[ x_j = 1 - \epsilon(t). \tag{38$'$} \]

Now we can combine (31), (36), (38)(ii), and (38)$'$ to conclude that 

\[ \|h(t) - w_j\| \le \epsilon(t). \tag{39} \]

We recall now that the eigenvector $h(t)$ itself was one of a set of $k$ orthonormal 
eigenvectors. We claim that distinct eigenvectors $h_j(t)$ are assigned to distinct 
vectors $w_j$; for, clearly two orthogonal unit vectors cannot both differ by less than $\epsilon$ 
from the same vector $w_j$. 

Inequality (39) shows that $h_j(t)$, properly normalized, tends to $w_j$ as $t \to 0$. 
Inequality (38)(i) shows that $a_j(t)$ is differentiable at $t = 0$ and that its derivative is $d_j$. 
It follows that for $t$ small but not equal to 0, $A(t)$ has simple eigenvalues near $a_0$. This 
concludes the proof of Theorem 12. □ 


4. ANALYTIC MATRIX-VALUED FUNCTIONS 


There are further results about differentiability of eigenvectors, the existence of 
higher derivatives, but since these are even more tedious than Theorem 12 we shall 
not pursue them, except for one observation, due to Rellich. Suppose $A(t)$ is an 
analytic function of $t$: 

\[ A(t) = \sum_{i=0}^{\infty} A_it^i, \tag{40} \]

where each $A_i$ is a self-adjoint matrix. Then also the characteristic polynomial of 
$A(t)$ is analytic in $t$. The characteristic equation 

\[ p(s, t) = 0 \]

defines $s$ as a function of $t$. Near a value of $t$ where the roots of $p$ are simple, the roots 
$a(t)$ are regular analytic functions of $t$; near a multiple root the roots have an 
algebraic singularity and can be expressed as power series in a fractional power of $t$: 

\[ a(t) = \sum_{i=0}^{\infty} r_it^{i/k}. \tag{40$'$} \]

On the other hand, we know from Theorem 4 of Chapter 8 that for real $t$ the matrix 
$A(t)$ is self-adjoint and therefore all its eigenvalues are real. Since fractional powers 
of $t$ have complex values for real $t$, we can deduce that in (40)$'$ only integer powers of 
$t$ occur, that is, that the eigenvalues $a(t)$ are regular analytic functions of $t$. 


5. AVOIDANCE OF CROSSING 

The discussion at the end of this chapter indicates that multiple eigenvalues of a 
matrix function A ⑺ have to be handled with care, even when the values of the 
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function are self-adjoint matrices* This brings up the question. How likely is it that 
A(;) will have multiple eigenvalues for some values of /? The answer is, 4 'Nol very 
likely ”； before making this precise, we describe a numeric^] experiment. 

Choose a value of h, and ihen pick at random iwo real, symmetric n x n matrices 
B and M, Define A(/) to he 


A{t) = B + iM. 


(41) 


Calculate numerically ih^ eigenvalues of A(f) ai a sufficiently dense set of values of 
/, The following behavior emerges: as t approaches certain values of f, a pair of 
adjacent eigenvalues a[{t) and 112 (f) appear to be on :i collision course; yet at the last 
minute they turn aside: 



This phenomenon, called avoidance of emssing, was discovered by physicists in the 
early days of quantum mechanics. The explanation of avoidance of crossing was 
given by Wigner and von Neumann; it hinges on the size of the set of real, symmelric 
matrices which have multiple eigetivalues, called degenerate in the physics 
liteniture* 

The set of ail real, symmelric n x n matrices forms a linear space of dimension 
N = n{n + 1)/2. There is another way of pammetrizing these matrices, namely by 
their eigenvectors and eigenvalues. We recall from Chapter 8 that the eigenvalues are 
real, and in case they are distinct, the eigenvectors are orthogonal; we shall choose 
them to have length L The first eigenvector, corresponding to the largest eigen value T 
depends on n - l parameters; the second one, constrained to be onhogonal to the 
first eigenvector, depends onn - 2 parameters, and so on, all the way to rhe{« — l)si 
eigenvector thuit depends on one parameter The last eigcnvcclor is ihcn determined, 
up to a facior plus or minus h The iota I number of these parameters is 
{n — I) + (fl — 2) + … + 1 = n{n - J)/2; to these we add the n eigenvalues, for 
a total of n{n — 1)/2 + « = n(n + 1)/2 ^ N parameters, as before, 

We turn now to the degenerate matrices, which have two equal eigenvalues, the
rest distinct from it and each other. The first eigenvector, corresponding to the largest
of the simple eigenvalues, depends on n - 1 parameters, the next one on n - 2
parameters, and so on, all the way down to the last simple eigenvector that depends
on two parameters. The remaining eigenspace is then uniquely determined. The total
number of these parameters is (n - 1) + ... + 2 = n(n - 1)/2 - 1; to these we
add the n - 1 distinct eigenvalues, for a total of n(n - 1)/2 - 1 + n - 1 =
n(n + 1)/2 - 2 = N - 2.

This explains the avoidance of crossing: a line or curve lying in N-dimensional
space will in general avoid intersecting a surface depending on N - 2 parameters.
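
The count can be checked by hand in the smallest case. The following worked equation assumes n = 2, so that N = 3; the symbols a, b, d for the matrix entries are introduced here for illustration.

```latex
% n = 2: N = n(n+1)/2 = 3 parameters a, b, d.
A = \begin{pmatrix} a & b \\ b & d \end{pmatrix}, \qquad
\lambda_{\pm} = \frac{a+d}{2} \pm \sqrt{\Bigl(\frac{a-d}{2}\Bigr)^{2} + b^{2}} .
% The two eigenvalues coincide only when the square root vanishes:
\lambda_{+} = \lambda_{-} \iff a = d \ \text{and}\ b = 0 \iff A = aI .
% The degenerate matrices form a 1-parameter family, and 1 = N - 2.
```

A double eigenvalue thus imposes two conditions, not one, which is why a generic curve in the three-dimensional space of such matrices misses the degenerate set.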

Exercise 8. (a) Show that the set of all complex, self-adjoint n x n matrices
forms an $N = n^2$-dimensional linear space over the reals.

(b) Show that the set of complex, self-adjoint n x n matrices that have one double
and n - 2 simple eigenvalues can be described in terms of N - 3 real parameters.

Exercise 9. Choose in (41) at random two self-adjoint 10 x 10 matrices M and
B. Using available software (MATLAB, MAPLE, etc.) calculate and graph at
suitable intervals the 10 eigenvalues of B + tM as functions of t over some
t-segment.

The graph of the eigenvalues of such a one-parameter family of 12 x 12
self-adjoint matrices ornaments the cover of this volume; they were computed
by David Muraki.



CHAPTER 10


Matrix Inequalities 


In this chapter we study self-adjoint mappings of a Euclidean space into itself that 
are positive. In Section 1 we state and prove the basic properties of positive 
mappings and properties of the relation A < B. In Section 2 we derive some
inequalities for the determinant of positive matrices. In Section 3 we study the
dependence of the eigenvalues on the matrix in light of the partial order A < B. In
Section 4 we show how to decompose arbitrary mappings of Euclidean space into
itself as a product of self-adjoint and unitary maps.


1. POSITIVITY

We recall from Chapter 8 the definition of a positive mapping:

Definition. A self-adjoint linear mapping H from a real or complex Euclidean
space into itself is called positive if

    $(x, Hx) > 0$ for all $x \neq 0$.    (1)

Positivity of H is denoted as H > O or O < H.

We call a self-adjoint map K nonnegative if the associated quadratic form is

    $(x, Kx) \geq 0$ for all x.    (2)

Nonnegativity of K is denoted as $K \geq O$ or $O \leq K$.

The basic properties of positive maps are contained in the following theorem.

Theorem 1

(i) The identity I is positive.

(ii) If M and N are positive, so is their sum M + N, as well as aM for any
positive number a.

(iii) If H is positive and Q is invertible, then

    $Q^*HQ > O$.    (3)

(iv) H is positive iff all its eigenvalues are positive.

(v) Every positive mapping is invertible.

(vi) Every positive mapping has a positive square root, uniquely determined.

(vii) The set of all positive maps is an open subset of the space of all
self-adjoint maps.

(viii) The boundary points of the set of all positive maps are nonnegative maps
that are not positive.

Proof. Part (i) is a consequence of the positivity of the scalar product; part (ii) is
obvious. For part (iii) we write the quadratic form associated with $Q^*HQ$ as

    $(x, Q^*HQx) = (Qx, HQx) = (y, Hy)$,    (3)'

where y = Qx. Since Q is invertible, if $x \neq 0$, then $y \neq 0$, and so by (1) the
right-hand side of (3)' is positive.

To prove (iv), let h be an eigenvector of H, a the eigenvalue: Hh = ah. Taking the
scalar product with h we get

    (h, Hh) = a(h, h);

clearly, this is positive only if a > 0. This shows that the eigenvalues of a positive
mapping are positive.

To show the converse, we appeal to Theorem 4 of Chapter 8, according to which
every self-adjoint mapping H has an orthonormal basis of eigenvectors. Denote these
by $h_j$ and the corresponding eigenvalues by $a_j$:

    $Hh_j = a_j h_j$.    (4)

Any vector x can be expressed as a linear combination of the $h_j$:

    $x = \sum_j x_j h_j$.    (4)'

Since the $h_j$ are eigenvectors,

    $Hx = \sum_j x_j a_j h_j$.    (4)''

Since the $h_j$ form an orthonormal basis,

    $(x, Hx) = \bigl(\sum_j x_j h_j, \sum_j x_j a_j h_j\bigr) = \sum_j a_j |x_j|^2$.    (5)

It follows from (5) that if all $a_j$ are positive, H is positive.

We deduce from (5) the following sharpening of inequality (1): for a positive
mapping

    $(x, Hx) \geq a\|x\|^2$, for all x,    (5)'

where a is the smallest eigenvalue of H.

(v) Every noninvertible map has a nullvector, which is an eigenvector with
eigenvalue zero. Since by (iv) a positive H has all positive eigenvalues, H is invertible.

(vi) We use the existence of an orthonormal basis formed by eigenvectors of H, H
positive. With x expanded as in (4)', we define $\sqrt{H}$ by

    $\sqrt{H}\,x = \sum_j x_j \sqrt{a_j}\, h_j$,    (6)

where $\sqrt{a_j}$ denotes the positive square root of $a_j$. Comparing this with the expansion
(4)'' of H itself we can verify that $(\sqrt{H})^2 = H$. Clearly, $\sqrt{H}$ as defined by (6) has
positive eigenvalues, and so by (iv) is positive.
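
Definition (6) can be carried out explicitly for 2 x 2 matrices. The sketch below assumes a real symmetric positive 2 x 2 matrix and uses the fact that, when the eigenvalues $a_1 < a_2$ are distinct, the sum in (6) equals alpha*H + beta*I, where the line alpha*s + beta takes the values $\sqrt{a_1}$, $\sqrt{a_2}$ at $a_1$, $a_2$. The function names are hypothetical.

```python
import math

def sqrt_positive_2x2(H):
    """Positive square root of a positive symmetric 2x2 matrix H, following (6)."""
    (a, b), (_, d) = H
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    a1, a2 = mean - radius, mean + radius        # eigenvalues, a1 <= a2
    if a1 <= 0:
        raise ValueError("H is not positive")
    if radius == 0.0:                            # H is a multiple of the identity
        r = math.sqrt(a1)
        return [[r, 0.0], [0.0, r]]
    s1, s2 = math.sqrt(a1), math.sqrt(a2)
    alpha = (s2 - s1) / (a2 - a1)                # slope of the interpolating line
    beta = s1 - alpha * a1                       # so that alpha*a1 + beta = sqrt(a1)
    return [[alpha * a + beta, alpha * b], [alpha * b, alpha * d + beta]]

def matmul2(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

H = [[2.0, 1.0], [1.0, 2.0]]      # eigenvalues 1 and 3
R = sqrt_positive_2x2(H)
R2 = matmul2(R, R)
print("sqrt(H)   =", R)
print("sqrt(H)^2 =", R2)
```

Squaring the result recovers H up to rounding, in line with the verification suggested after (6).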

(vii) Let H be any positive mapping, and N any self-adjoint mapping whose
distance from H is less than a,

    $\|N - H\| < a$,

where a is the smallest eigenvalue of H. We claim that N is positive. Denote N - H
by M; the assumption is that $\|M\| < a$. This means that for all nonzero x in X,

    $\|Mx\| < a\|x\|$.

By the Schwarz inequality, for $x \neq 0$,

    $|(x, Mx)| \leq \|x\|\,\|Mx\| < a\|x\|^2$.

Using this and (5)', we see that for $x \neq 0$,

    $(x, Nx) = (x, (H + M)x) = (x, Hx) + (x, Mx) > a\|x\|^2 - a\|x\|^2 = 0$.

This shows that H + M = N is positive.

(viii) By definition of boundary, every mapping K on the boundary is the limit of
mappings $H_n > O$:

    $\lim_{n \to \infty} H_n = K$.

It follows from the Schwarz inequality that for every x,

    $\lim_{n \to \infty} (x, H_n x) = (x, Kx)$.

Since each $(x, H_n x)$ is positive, and the limit of positive numbers is nonnegative, it
follows that $K \geq O$. K cannot be positive, for then by part (vii) it would not be on the
boundary. □

Exercise 1. How many square roots are there of a positive mapping?

Characterizations analogous to parts of Theorem 1 hold for nonnegative
mappings:

Exercise 2. Formulate and prove properties of nonnegative mappings similar
to parts (i), (ii), (iii), (iv), and (vi) of Theorem 1.

Based on the notion of positivity we can define a partial order among self-adjoint
mappings of a given Euclidean space into itself.

Definition. Let M and N be two self-adjoint mappings of a Euclidean space into
itself. We say that M is less than N, denoted as

    M < N or N > M,    (7)

if N - M is positive:

    O < N - M.    (7)'

The relation $M \leq N$ is defined analogously.

The following properties are easy consequences of Theorem 1.

Additive Property. If $M_1 < N_1$ and $M_2 < N_2$, then

    $M_1 + M_2 < N_1 + N_2$.    (8)

Transitive Property. If L < M and M < N, then L < N.

Multiplicative Property. If M < N and Q is invertible, then

    $Q^*MQ < Q^*NQ$.    (9)

The partial ordering defined in (7) and (7)' for self-adjoint maps shares some,
but not all, other properties of the natural ordering of real numbers. For instance,
the reciprocal property holds.

Theorem 2. Let M and N denote positive mappings that satisfy

    O < M < N.    (10)

Then

    $M^{-1} > N^{-1}$.    (10)'

First Proof. We start with the case when N = I. By definition, M < I means that
I - M is positive. According to part (iv) of Theorem 1, that means that the
eigenvalues of I - M are positive, that is, that the eigenvalues of M are less than 1.
Since M is positive, the eigenvalues of M lie between 0 and 1. The eigenvalues of
$M^{-1}$ are reciprocals of those of M; therefore the eigenvalues of $M^{-1}$ are greater than
1. That makes the eigenvalues of $M^{-1} - I$ positive; so by part (iv) of Theorem 1,
$M^{-1} - I$ is positive, which makes $M^{-1} > I$.

We turn now to any N satisfying (10); according to part (vi) of Theorem 1, we can
factor $N = R^2$, R > O. According to part (v) of Theorem 1, R is invertible; we use
now property (9), with $Q = R^{-1}$, to deduce from (10) that

    $O < R^{-1}MR^{-1} < R^{-1}NR^{-1} = I$.

From what we have already shown, it follows from this inequality that the inverse of
$R^{-1}MR^{-1}$ is greater than I:

    $RM^{-1}R > I$.

We use once more property (9), with $Q = R^{-1}$, to deduce that

    $M^{-1} > R^{-1}IR^{-1} = R^{-2} = N^{-1}$. □


Second Proof. We shall use the following generally useful calculus lemma.

Lemma 3. Let A(t) be a differentiable function of the real variable t whose
values are self-adjoint mappings; the derivative (d/dt)A is then also self-adjoint.
Suppose that (d/dt)A is positive; then A(t) is an increasing function, that is,

    A(s) < A(t) when s < t.    (11)


Proof. Let x be any nonzero vector, independent of t. Then by the assumption
that the derivative of A is positive, we obtain

    $\frac{d}{dt}(x, Ax) = \bigl(x, \frac{d}{dt}A\,x\bigr) > 0$.

So by ordinary calculus, (x, A(t)x) is an increasing function of t:

    (x, A(s)x) < (x, A(t)x) for s < t.

This implies that A(t) - A(s) > O, which is the meaning of (11). □

Let A(t) be as in Lemma 3, and in addition suppose that A(t) is invertible; we
claim that $A^{-1}(t)$ is a decreasing function of t. To see this we differentiate $A^{-1}$, using
Theorem 2 of Chapter 9:

    $\frac{d}{dt}A^{-1} = -A^{-1}\frac{dA}{dt}A^{-1}$.

We have assumed that dA/dt is positive, so it follows from part (iii) of Theorem 1
that so is $A^{-1}(dA/dt)A^{-1}$. This shows that the derivative of $A^{-1}(t)$ is negative. It
follows then from Lemma 3 that $A^{-1}(t)$ is decreasing.

We now define

    A(t) = M + t(N - M),    $0 \leq t \leq 1$.    (12)

Clearly, dA/dt = N - M, positive by assumption (10). It further follows from
assumption (10) that for $0 \leq t \leq 1$,

    A(t) = (1 - t)M + tN

is the sum of two positive operators and therefore itself positive. By part (v) of
Theorem 1 we conclude that A(t) is invertible. We can assert now, as shown above,
that $A^{-1}(t)$ is a decreasing function:

    $A^{-1}(0) > A^{-1}(1)$.

Since A(0) = M, A(1) = N, this is inequality (10)'. This concludes the second proof
of Theorem 2. □


The product of two self-adjoint mappings is not, in general, self-adjoint. We 
introduce the symmetrized product S of two self-adjoint mappings A and B as 

S = AB + BA, (13) 

The quadratic form associated with the symmetrized product is 

    (x, Sx) = (x, ABx) + (x, BAx) = (Ax, Bx) + (Bx, Ax).    (14)

In the real case

    (x, Sx) = 2(Ax, Bx).    (14)'

This formula shows that the symmetrized product of two positive mappings need not
be positive; the conditions (x, Ax) > 0 and (x, Bx) > 0 mean that the pairs of vectors
x, Ax and x, Bx make an angle less than $\pi/2$. But these restrictions do not prevent the
vectors Ax, Bx from making an angle greater than $\pi/2$, which would render (14)'
negative.

Exercise 3. Construct two real, positive 2 x 2 matrices whose symmetrized
product is not positive.

In view of Exercise 3 the following result is somewhat surprising.

Theorem 4. Let A and B denote two self-adjoint maps with the following
properties:

(i) A is positive.

(ii) The symmetrized product S = AB + BA is positive.

Then B is positive.

Proof. Define B(t) as B(t) = B + tA. We claim that for $t \geq 0$ the symmetrized
product of A and B(t) is positive. For

    $S(t) = AB(t) + B(t)A = AB + BA + 2tA^2 = S + 2tA^2$;

since S and $2tA^2$ are positive, their sum is positive. We further claim that for t large
enough positive, B(t) is positive. For

    (x, B(t)x) = (x, Bx) + t(x, Ax);    (15)

A was assumed positive, so by (5)',

    $(x, Ax) \geq a\|x\|^2$,    a > 0.

On the other hand, by the Schwarz inequality

    $|(x, Bx)| \leq \|x\|\,\|Bx\| \leq \|B\|\,\|x\|^2$.

Putting these inequalities together with (15), we get

    $(x, B(t)x) \geq (ta - \|B\|)\|x\|^2$;

clearly this shows that B(t) is positive when $ta > \|B\|$.

Since B(t) depends continuously on t, if B = B(0) were not positive, there
would be some nonnegative value $t_0$ between 0 and $\|B\|/a$, such that $B(t_0)$ lies
on the boundary of the set of positive mappings. According to part (viii) of
Theorem 1, a mapping on the boundary is nonnegative but not positive. Such a
mapping $B(t_0)$ has nonnegative eigenvalues, at least one of which is zero.
So there is a nonzero vector y such that $B(t_0)y = 0$. Setting x = y in (14) with
$B = B(t_0)$, we obtain

    $(y, S(t_0)y) = (Ay, B(t_0)y) + (B(t_0)y, Ay) = 0$;

this is contrary to the positivity of $S(t_0)$; therefore B is positive. □

In Section 4 we offer a second proof of Theorem 4.

An interesting consequence of Theorem 4 is the following theorem.

Theorem 5. Let M and N denote positive mappings that satisfy

    O < M < N;    (16)

then

    $\sqrt{M} < \sqrt{N}$,    (16)'

where $\sqrt{\ }$ denotes the positive square root.


Proof. Define the function A(t) as in (12):

    A(t) = M + t(N - M).

We have shown that A(t) is positive when $0 \leq t \leq 1$; so we can define

    $R(t) = \sqrt{A(t)}$,    $0 \leq t \leq 1$,    (17)

where $\sqrt{\ }$ is the positive square root. It is not hard to show that R(t), the square root
of a differentiable positive function, is differentiable. We square (17), obtaining
$R^2 = A$; differentiating with respect to t, we obtain

    $\dot{R}R + R\dot{R} = \dot{A}$,    (18)

where the dot denotes the derivative with respect to t. Recalling the definition (13) of
symmetrized product we can paraphrase (18) as follows: The symmetrized product
of R and $\dot{R}$ is $\dot{A}$.

By hypothesis (16), $\dot{A} = N - M$ is positive; by construction, so is R. Therefore
using Theorem 4 we conclude that $\dot{R}$ is positive on the interval [0, 1]. It follows then
from Lemma 3 that R(t) is an increasing function of t; in particular,

    R(0) < R(1).

Since $R(0) = \sqrt{A(0)} = \sqrt{M}$ and $R(1) = \sqrt{A(1)} = \sqrt{N}$, inequality (16)'
follows. □

Exercise 4. Show that if O < M < N, then (a) $M^{1/4} < N^{1/4}$, (b) $M^{1/m} < N^{1/m}$,
m a power of 2, (c) $\log M \leq \log N$.

Fractional powers and logarithm are defined by the functional calculus in
Chapter 8. (Hint: $\log M = \lim_{m \to \infty} m[M^{1/m} - I]$.)

Exercise 5. Construct a pair of mappings O < M < N such that $M^2$ is not less
than $N^2$. (Hint: Use Exercise 3.)

There is a common theme in Theorems 2 and 5 and Exercises 4 and 5 that can be
expressed by the concept of monotone matrix function.

Definition. A real-valued function f(s) defined for s > 0 is called a monotone
matrix function if all pairs of self-adjoint mappings M, N satisfying

    O < M < N

also satisfy

    f(M) < f(N),

where f(M), f(N) are defined by the functional calculus of Chapter 8.

According to Theorems 2 and 5, and Exercise 4, the functions f(s) = -1/s, $s^{1/m}$,
log s are monotone matrix functions. Exercise 5 says $f(s) = s^2$ is not.
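
These statements can be tested numerically on one pair of matrices. The sketch below is an illustration, not a proof; it assumes real symmetric 2 x 2 matrices, a hand-picked pair O < M < N, and the observation that for such matrices the functional calculus gives f(S) = alpha*S + beta*I, where the line alpha*s + beta agrees with f at the two eigenvalues of S. All function names are hypothetical.

```python
import math

def eig2(S):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix S."""
    (a, b), (_, d) = S
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def apply_function(f, S):
    """f(S) for symmetric 2x2 S, via interpolation of f at the eigenvalues."""
    lo, hi = eig2(S)
    if hi == lo:
        alpha, beta = 0.0, f(lo)
    else:
        alpha = (f(hi) - f(lo)) / (hi - lo)
        beta = f(lo) - alpha * lo
    (a, b), (_, d) = S
    return [[alpha * a + beta, alpha * b], [alpha * b, alpha * d + beta]]

def is_less(M, N):
    """M < N in the sense of (7): the smallest eigenvalue of N - M is positive."""
    diff = [[N[i][j] - M[i][j] for j in range(2)] for i in range(2)]
    return eig2(diff)[0] > 0.0

# Hypothetical pair with O < M < N.
M = [[1.1, 1.0], [1.0, 1.1]]
N = [[2.2, 1.0], [1.0, 1.2]]

functions = {
    "-1/s": lambda s: -1.0 / s,
    "sqrt(s)": math.sqrt,
    "log(s)": math.log,
    "s^2": lambda s: s * s,
}
results = {name: is_less(apply_function(f, M), apply_function(f, N))
           for name, f in functions.items()}
for name, ok in results.items():
    print(f"f(s) = {name:8s} f(M) < f(N): {ok}")
```

For this pair the first three functions preserve the order, while $N^2 - M^2$ has a negative determinant and hence a negative eigenvalue, so squaring does not.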

Positive multiples, sums, and limits of monotone matrix functions are mmf's.
Thus

    $-\sum_j \frac{m_j}{s + t_j}$,    $m_j > 0$, $t_j > 0$

are mmf's, as is

    $f(s) = as + b - \int_0^\infty \frac{dm(t)}{s + t}$,    (19)

where a is positive, b is real, and m(t) is a nonnegative measure for which the
integral (19) converges.

Carl Loewner has proved the following beautiful theorem.

Theorem. Every monotone matrix function can be written in the form (19).

At first glance, this result seems useless, because how can one recognize that a
function f(s) defined on $\mathbb{R}_+$ is of form (19)? There is, however, a surprisingly simple
criterion:

Every function f of form (19) can be extended as an analytic function in the upper
half-plane, and has a positive imaginary part there.

Exercise 6. Verify that (19) defines f(z) for a complex argument z as an analytic
function, as well as that Im f(z) > 0 for Im z > 0.

Conversely, a classical theorem of Herglotz and F. Riesz says that every function
analytic in the upper half-plane whose imaginary part is positive there, and which is
real on the positive real axis, is of form (19). For a proof, consult the author's text
entitled Functional Analysis.

The functions -1/s, $s^{1/m}$ with m > 1, and log s have positive imaginary parts in the
upper half-plane; the function $s^2$ does not.

Having talked so much about positive mappings, it is time to present some
examples. Below we describe a method for constructing positive matrices, in fact all
of them.

Definition. Let $f_1, \ldots, f_m$ be an ordered set of vectors in a Euclidean space. The
matrix G with entries

    $G_{ij} = (f_j, f_i)$    (20)

is called the Gram matrix of the set of vectors.

Theorem 6. (i) Every Gram matrix is nonnegative.

(ii) The Gram matrix of a set of linearly independent vectors is positive.

(iii) Every positive matrix can be represented as a Gram matrix.

Proof. The quadratic form associated with a Gram matrix can be expressed as
follows:

    $(x, Gx) = \sum_{i,j} x_i \overline{G_{ij}}\,\overline{x_j} = \sum_{i,j} (f_i, f_j)\, x_i \overline{x_j}
             = \bigl(\sum_i x_i f_i, \sum_j x_j f_j\bigr) = \bigl\|\sum_i x_i f_i\bigr\|^2$.    (20)'

Parts (i) and (ii) follow immediately from (20)'. To prove part (iii), let $(h_{ij}) = H$
be positive. Define for vectors x and y in $\mathbb{C}^n$ the nonstandard scalar product $(\ ,\ )_H$
as

    $(x, y)_H = (x, Hy)$,

where ( , ) is the standard scalar product. The Gram matrix of the unit vectors $f_j = e_j$ is

    $(e_i, e_j)_H = (e_i, He_j) = h_{ij}$. □

Example. Take the Euclidean space to consist of real-valued functions on the
interval [0, 1], with the scalar product

    $(f, g) = \int_0^1 f(t)\,g(t)\,dt$.

Choose $f_j = t^{j-1}$, j = 1, ..., n. The associated Gram matrix is

    $G_{ij} = \frac{1}{i + j - 1}$.    (21)
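
Identity (20)' can be checked exactly for this Gram matrix. The sketch below assumes n = 4 and an arbitrary rational test vector x; it compares the quadratic form (x, Gx) with the integral over [0, 1] of the square of the polynomial $x_1 + x_2 t + \ldots + x_n t^{n-1}$, using exact fractions. The function names are hypothetical.

```python
from fractions import Fraction

def hilbert_gram(n):
    """Gram matrix (21) of 1, t, ..., t^(n-1) on [0, 1]: G_ij = 1/(i + j - 1)."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def quadratic_form(G, x):
    """(x, Gx) for a real vector x."""
    n = len(x)
    return sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n))

def integral_of_square(x):
    """Exact integral over [0, 1] of (x_1 + x_2 t + ... + x_n t^(n-1))^2."""
    n = len(x)
    square = [Fraction(0)] * (2 * n - 1)   # coefficients of the squared polynomial
    for i in range(n):
        for j in range(n):
            square[i + j] += x[i] * x[j]
    return sum(c / (k + 1) for k, c in enumerate(square))

G = hilbert_gram(4)
x = [Fraction(3), Fraction(-24), Fraction(50), Fraction(-30)]  # arbitrary test vector
form_value = quadratic_form(G, x)
norm_squared = integral_of_square(x)
print("(x, Gx)          =", form_value)
print("integral of p^2  =", norm_squared)
```

The two numbers agree and are positive, as (20)' predicts for linearly independent $f_j$.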


Exercise 7. Given positive numbers $r_1, \ldots, r_n$, show that the matrix

    $G_{ij} = \frac{1}{r_i + r_j + 1}$    (22)

is positive.


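Example. Take as scalar product

    $(f, g) = \int_0^{2\pi} f(\theta)\,\overline{g(\theta)}\,w(\theta)\,d\theta$,

where w is some given positive real function. Choose $f_j = e^{ij\theta}$, j = -n, ..., n. The
associated (2n + 1) x (2n + 1) Gram matrix is $G_{kj} = c_{k-j}$, where

    $c_p = \int_0^{2\pi} w(\theta)\,e^{-ip\theta}\,d\theta$.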


We conclude this section with a curious result due to I. Schur.

Theorem 7. Let $A = (A_{ij})$ and $B = (B_{ij})$ denote positive matrices. Then
$M = (M_{ij})$, whose entries are the products of the entries of A and B,

    $M_{ij} = A_{ij}B_{ij}$,    (23)

also is a positive matrix.
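
Theorem 7 is easy to probe numerically. The sketch below is an illustration under stated assumptions: real symmetric 2 x 2 matrices, positive matrices generated as $T^T T + 0.1 I$ from a seeded random T (a construction chosen here, not taken from the text), and positivity tested through the smallest eigenvalue as in part (iv) of Theorem 1.

```python
import math
import random

def min_eig2(S):
    """Smallest eigenvalue of a real symmetric 2x2 matrix S."""
    (a, b), (_, d) = S
    return (a + d) / 2.0 - math.hypot((a - d) / 2.0, b)

def random_positive_2x2(rng):
    """T^T T + 0.1*I for a random real 2x2 matrix T; positive by construction."""
    t = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    g = [[sum(t[k][i] * t[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    g[0][0] += 0.1
    g[1][1] += 0.1
    return g

def schur_product(A, B):
    """Entrywise product (23) of two 2x2 matrices."""
    return [[A[i][j] * B[i][j] for j in range(2)] for i in range(2)]

rng = random.Random(0)
smallest = min(
    min_eig2(schur_product(random_positive_2x2(rng), random_positive_2x2(rng)))
    for _ in range(1000)
)
print(f"smallest eigenvalue among the entrywise products: {smallest:.4f}")
```

Every entrywise product in the sample has a positive smallest eigenvalue, in agreement with the theorem.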

In Appendix 4 we shall give a one-line proof of Theorem 7 using tensor products.

2. THE DETERMINANT OF POSITIVE MATRICES

Theorem 8. The determinant of every positive matrix is positive.


Proof. According to Theorem 3 of Chapter 6, the determinant of a matrix is the
product of its eigenvalues. According to Theorem 1 of this chapter, the eigenvalues
of a positive matrix are positive. Then so is their product. □

Theorem 9. Let A and B denote real, self-adjoint, positive n x n matrices.
Then for all t between 0 and 1,

    $\det(tA + (1 - t)B) \geq (\det A)^t (\det B)^{1-t}$.    (24)

Proof. Take the logarithm of both sides. Since log is a monotonic function, we
get the equivalent inequality: for all t in [0, 1],

    $\log\det(tA + (1 - t)B) \geq t \log\det A + (1 - t)\log\det B$.    (24)'

We recall the concept of a concave function of a single variable: A function f(x)
is called concave if its graph between two points lies above the chord connecting
those points. Analytically, this means that for all t in [0, 1],

    $f(ta + (1 - t)b) \geq tf(a) + (1 - t)f(b)$.

Clearly, (24)' can be interpreted as asserting that the function log det H is concave on
the set of positive matrices. Note that it follows from Theorem 1 that for A and B
positive, tA + (1 - t)B is positive when $0 \leq t \leq 1$. According to a criterion we learn
in calculus, a function whose second derivative is negative is concave. For example,
the function log t, defined for t positive, has second derivative $-1/t^2$, and so it is
concave. To prove (24)', we shall calculate the second derivative of the function
f(t) = log det(tA + (1 - t)B) and verify that it is negative. We use formula (10) of
Theorem 4 in Chapter 9, valid for matrix-valued functions Y(t) that are
differentiable and invertible:

    $\frac{d}{dt}\log\det Y = \mathrm{tr}(Y^{-1}\dot{Y})$.    (25)

In our case, Y(t) = B + t(A - B); its derivative is $\dot{Y} = A - B$, independent of t. So,
differentiating (25) with respect to t, we get

    $\frac{d^2}{dt^2}\log\det Y = \mathrm{tr}(-Y^{-1}\dot{Y}Y^{-1}\dot{Y}) = -\mathrm{tr}(Y^{-1}\dot{Y})^2$.    (25)'

Here we have used the product rule, and rules (2)' and (3) from Chapter 9
concerning the differentiation of the trace and the reciprocal of matrix functions.





According to Theorem 3 of Chapter 6, the trace of a matrix is the sum of its
eigenvalues; and according to Theorem 4 of Chapter 6, the eigenvalues of the square
of a matrix T are the squares of the eigenvalues of T. Therefore

    $\mathrm{tr}(Y^{-1}\dot{Y})^2 = \sum_j a_j^2$,    (26)

where $a_j$ are the eigenvalues of $Y^{-1}\dot{Y}$. According to Theorem 11' in Chapter 8, the
eigenvalues $a_j$ of the product $Y^{-1}\dot{Y}$ of a positive matrix $Y^{-1}$ and a self-adjoint matrix
$\dot{Y}$ are real. It follows that (26) is positive; setting this into (25)', we conclude that the
second derivative of log det Y(t) is negative. □

Second Proof. Define C as $B^{-1}A$; by Theorem 11' of Chapter 8, the product C of
two positive matrices has positive eigenvalues $c_j$. Now rewrite the left-hand side of
(24) as

    $\det B(tB^{-1}A + (1 - t)I) = \det B \,\det(tC + (1 - t)I)$.

Divide both sides of (24) by det B; the resulting right-hand side can be rewritten as

    $(\det A)^t (\det B)^{-t} = (\det C)^t$.

What is to be shown is that

    $\det(tC + (1 - t)I) \geq (\det C)^t$.

Expressing the determinants as the product of eigenvalues gives

    $\prod_j (tc_j + 1 - t) \geq \prod_j c_j^t$.

We claim that for all t between 0 and 1 each factor on the left is greater than or
equal to the corresponding factor on the right:

    $tc + (1 - t) \geq c^t$.

This is true because $c^t$ is a convex function of t and equality holds when t = 0 or
t = 1. □
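
Inequality (24) can be tabulated for a concrete pair. The sketch below assumes two hand-picked real symmetric positive 2 x 2 matrices and samples t at steps of 0.1; the helper names are hypothetical.

```python
def det2(S):
    """Determinant of a 2x2 matrix."""
    return S[0][0] * S[1][1] - S[0][1] * S[1][0]

def combine(t, A, B):
    """The matrix t*A + (1 - t)*B."""
    return [[t * A[i][j] + (1.0 - t) * B[i][j] for j in range(2)]
            for i in range(2)]

A = [[2.0, 1.0], [1.0, 2.0]]   # det A = 3
B = [[1.0, 0.0], [0.0, 5.0]]   # det B = 5

rows = []
for k in range(11):
    t = k / 10.0
    left = det2(combine(t, A, B))                    # det(tA + (1 - t)B)
    right = det2(A) ** t * det2(B) ** (1.0 - t)      # (det A)^t (det B)^(1 - t)
    rows.append((t, left, right))
    print(f"t = {t:.1f}   left = {left:.4f}   right = {right:.4f}")
```

For this pair the left side is $5 + 2t - 4t^2$; it equals the right side at t = 0 and t = 1 and exceeds it in between.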

Next we give a useful estimate for the determinant of a positive matrix. 

Theorem 10. The determinant of a positive matrix H does not exceed the
product of its diagonal elements:

    $\det H \leq \prod_i h_{ii}$.    (27)

Proof. Since H is positive, so are its diagonal entries. Define $d_i = 1/\sqrt{h_{ii}}$ and
denote by D the diagonal matrix with diagonal entries $d_i$. Define the matrix B by

    B = DHD.

Clearly, B is symmetric and positive and its diagonal entries are all 1's. By the
multiplicative property of determinants,

    $\det B = \det H \,\det D^2 = \frac{\det H}{\prod_i h_{ii}}$.    (28)

So (27) is the same as $\det B \leq 1$. To show this, denote the eigenvalues of B by
$b_1, \ldots, b_n$, positive quantities since B is a positive matrix. By the
arithmetic-geometric mean inequality

    $\prod_i b_i \leq \Bigl(\frac{\sum_i b_i}{n}\Bigr)^n$.

We can rewrite this as

    $\det B \leq \Bigl(\frac{\mathrm{tr}\,B}{n}\Bigr)^n$.    (29)

Since the diagonal entries of B are all 1's, tr B = n, so $\det B \leq 1$ follows. □


Theorem 10 has this consequence.


Theorem 11. Let T be any n x n matrix whose columns are $c_1, \ldots, c_n$. Then
the determinant of T is in absolute value not greater than the product of the lengths of
its columns:

    $|\det T| \leq \prod_j \|c_j\|$.    (30)

Proof. Define $H = T^*T$; its diagonal elements are

    $h_{ii} = \sum_j \overline{t_{ji}}\,t_{ji} = \sum_j |t_{ji}|^2 = \|c_i\|^2$.

According to Theorem 1, $T^*T$ is positive, except when T is noninvertible, in which
case det T = 0, so there is nothing to prove. We appeal now to Theorem 10 and
deduce that

    $\det H \leq \prod_i \|c_i\|^2$.

Since the determinant is multiplicative, and since $\det T^* = \overline{\det T}$,

    $\det H = \det T^* \det T = |\det T|^2$.

Combining the last two and taking the square root we obtain inequality (30) of
Theorem 11. □

Inequality (30) is due to Hadamard and is useful in applications. In the real case it 
has an obvious geometrical meaning: among all parallelepipeds with given side 
lengths || cj ||, the one with the largest volume is rectangular. 
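
A small numerical check of (30) follows. It assumes real 3 x 3 matrices and two hand-picked examples: a generic matrix T, and a matrix Q with mutually orthogonal columns, for which the bound is attained. The helper names are hypothetical.

```python
import math

def det3(T):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = T
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def hadamard_bound(T):
    """Product of the Euclidean lengths of the columns of a 3x3 matrix."""
    bound = 1.0
    for col in range(3):
        bound *= math.sqrt(sum(T[row][col] ** 2 for row in range(3)))
    return bound

T = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [4.0, 0.0, 1.0]]   # generic example
Q = [[0.0, 2.0, 0.0], [3.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # orthogonal columns

for name, mat in (("T", T), ("Q", Q)):
    print(f"{name}: |det| = {abs(det3(mat)):.4f}, bound = {hadamard_bound(mat):.4f}")
```

For T the determinant is 25 against a bound of about 29.15; for Q both sides equal 6, the rectangular case described above.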

We return to Theorem 9 about determinants; the first proof we gave for it used the
differential calculus. We present now a proof based on integral calculus. This proof
works for real, symmetric matrices; it is based on an integral formula for the
determinant of real positive matrices.

Theorem 12. Let H be an n x n real, symmetric, positive matrix. Then

    $\frac{\pi^{n/2}}{\sqrt{\det H}} = \int_{\mathbb{R}^n} e^{-(x, Hx)}\,dx$.    (31)

Proof. It follows from inequality (5)' that the integral (31) converges. To evaluate
it, we appeal to the spectral theorem for self-adjoint mappings, see Theorem 4' of
Chapter 8, and introduce new coordinates

    x = My,    (32)

M an orthogonal matrix so chosen that the quadratic form is diagonalized:

    $(x, Hx) = (My, HMy) = (y, M^*HMy) = \sum_j a_j y_j^2$.    (33)

The $a_j$ are the eigenvalues of H. We substitute (33) into (31); since the matrix M is an
isometry, it preserves volume as well: |det M| = 1. In terms of the new variables the
integrand is a product of functions of single variables, so we can rewrite the right
side of (31) as a product of one-dimensional integrals:

    $\int e^{-\sum_j a_j y_j^2}\,dy = \prod_j \int e^{-a_j y_j^2}\,dy_j$.    (34)

The change of variable $\sqrt{a_j}\,y_j = z$ turns each of the integrals on the right in (34) into

    $\frac{1}{\sqrt{a_j}} \int_{-\infty}^{\infty} e^{-z^2}\,dz$.

According to a result of calculus

    $\int_{-\infty}^{\infty} e^{-z^2}\,dz = \sqrt{\pi}$,    (35)

so that the right-hand side of (34) equals

    $\frac{\pi^{n/2}}{\prod_j \sqrt{a_j}} = \frac{\pi^{n/2}}{(\prod_j a_j)^{1/2}}$.    (34)'

According to formula (15), Theorem 3 in Chapter 6, the determinant of H is the
product of its eigenvalues; so formula (31) of Theorem 12 follows from (34) and
(34)'. □
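
Formula (31) can be confirmed numerically in two dimensions. The sketch below assumes n = 2, a hand-picked positive matrix H, and a plain Riemann sum on a square grid; the cutoff half_width and the step size are choices made here, adequate because the integrand is below $e^{-64}$ outside the square.

```python
import math

def gaussian_integral_2d(H, half_width=8.0, step=0.05):
    """Riemann-sum approximation of the integral of exp(-(x, Hx)) over the plane,
    truncated to the square [-half_width, half_width]^2."""
    (a, b), (_, d) = H
    n = int(round(half_width / step))
    total = 0.0
    for i in range(-n, n + 1):
        x = i * step
        for j in range(-n, n + 1):
            y = j * step
            total += math.exp(-(a * x * x + 2.0 * b * x * y + d * y * y))
    return total * step * step

H = [[2.0, 1.0], [1.0, 2.0]]                  # eigenvalues 1 and 3, det H = 3
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
numeric = gaussian_integral_2d(H)
exact = math.pi / math.sqrt(det_H)            # pi^(n/2) / sqrt(det H) with n = 2
print(f"numeric = {numeric:.10f}, exact = {exact:.10f}")
```

The two values agree to many digits, since equally spaced sums converge very rapidly for Gaussian integrands.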


Exercise 8 . Look up a proof of the calculus result (35). 

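Proof of Theorem 9. We take in formula (31), H = tA + (1 - t)B, where A, B
are arbitrary real, positive matrices:

    $\frac{\pi^{n/2}}{\sqrt{\det(tA + (1 - t)B)}} = \int_{\mathbb{R}^n} e^{-t(x, Ax)}\,e^{-(1 - t)(x, Bx)}\,dx$.    (36)

We appeal now to Hölder's inequality:

    $\int fg\,dx \leq \Bigl(\int f^p\,dx\Bigr)^{1/p} \Bigl(\int g^q\,dx\Bigr)^{1/q}$,

where p, q are real, positive numbers such that

    $\frac{1}{p} + \frac{1}{q} = 1$.

We take

    $f(x) = e^{-t(x, Ax)}$,    $g(x) = e^{-(1 - t)(x, Bx)}$,

and choose p = 1/t, q = 1/(1 - t); we deduce that the integral on the right in (36) is
not greater than

    $\Bigl(\int e^{-(x, Ax)}\,dx\Bigr)^t \Bigl(\int e^{-(x, Bx)}\,dx\Bigr)^{1-t}$.

Using formula (31) to express these integrals we get

    $\Bigl(\frac{\pi^{n/2}}{\sqrt{\det A}}\Bigr)^t \Bigl(\frac{\pi^{n/2}}{\sqrt{\det B}}\Bigr)^{1-t} = \frac{\pi^{n/2}}{(\sqrt{\det A})^t (\sqrt{\det B})^{1-t}}$.

Since this is an upper bound for (36), inequality (24) follows. □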

Formula (31) also can be used to give another proof of Theorem 10. 

Proof. In the integral on the right in (31) we write the vector variable x as
$x = ue_1 + z$, where u is the first component of x and z the rest of them. Then

    $(x, Hx) = h_{11}u^2 + 2u\,l(z) + (z, H_{11}z)$,

where l(z) is some linear function of z and $H_{11}$ denotes the (n - 1) x (n - 1) matrix
obtained from H by deleting its first row and column. Setting this into (31) gives

    $\frac{\pi^{n/2}}{\sqrt{\det H}} = \iint e^{-h_{11}u^2 - 2u\,l(z) - (z, H_{11}z)}\,du\,dz$.    (37)

Changing the variable u to -u transforms the above integral into

    $\iint e^{-h_{11}u^2 + 2u\,l(z) - (z, H_{11}z)}\,du\,dz$.

Adding and dividing by 2 gives

    $\frac{\pi^{n/2}}{\sqrt{\det H}} = \iint e^{-h_{11}u^2}\,e^{-(z, H_{11}z)}\,\frac{c + c^{-1}}{2}\,du\,dz$,    (37)'

where c abbreviates $e^{2ul}$. Since c is positive,

    $\frac{c + c^{-1}}{2} \geq 1$.

Therefore (37)' is bounded from below by

    $\iint e^{-h_{11}u^2}\,e^{-(z, H_{11}z)}\,du\,dz$.

The integrand is now the product of a function of u and of z, and so is the product of
two integrals, both of which can be evaluated by (31):

    $\frac{\pi^{1/2}}{\sqrt{h_{11}}} \cdot \frac{\pi^{(n-1)/2}}{\sqrt{\det H_{11}}}$.

Since this is a lower bound for the right-hand side of (37), we obtain that
$\det H \leq h_{11} \det H_{11}$. Inequality (27) follows by induction on the size of H. □


3. EIGENVALUES 

In this section we present a number of interesting and useful results on
eigenvalues.

Lemma 13. Let A be a self-adjoint map of a Euclidean space U into itself. We
denote by $p_+(A)$ the number of positive eigenvalues of A, and denote by $p_-(A)$ the
number of its negative eigenvalues.

$p_+(A)$ = maximum dimension of subspace S of U such that (Au, u) is positive
on S.

$p_-(A)$ = maximum dimension of subspace S of U such that (Au, u) is negative on S.

Proof. This follows from the minmax characterization of the eigenvalues of A;
see Theorem 10, as well as Lemma 2 of Chapter 8. □

Theorem 14. Let U and A be as in Lemma 13, and let V be a subspace of U
whose dimension is one less than the dimension of U:

    dim V = dim U - 1.

Denote by P orthogonal projection onto V. Then PAP is a self-adjoint map of U into
U that maps V into V; we denote by B the restriction of PAP to V. We claim that

    $p_+(A) - 1 \leq p_+(B) \leq p_+(A)$,    $(38)_+$

and

    $p_-(A) - 1 \leq p_-(B) \leq p_-(A)$.    $(38)_-$


Proof. Let T denote a subspace of V of dimension $p_+(B)$ on which B is positive:

    (Bv, v) > 0,    v in T, $v \neq 0$.

By definition of B, we can write this as

    0 < (PAPv, v) = (APv, Pv).

Since v belongs to T, a subspace of V, Pv = v. So we conclude that A is positive on T;
this proves that

    $p_+(B) \leq p_+(A)$.

To estimate $p_+(B)$ from below, we choose a subspace S of U, of dimension $p_+(A)$,
on which A is positive:

    (Au, u) > 0,    u in S, $u \neq 0$.

Denote the intersection of S and V by T:

    $T = S \cap V$.

We claim that the dimension of T is at most one less than the dimension of S:

    $\dim S - 1 \leq \dim T$.

If S is a subspace of V, then T = S and dim T = dim S. If not, choose a basis in
S: $\{s_1, \ldots, s_k\}$. At least one of these, say $s_1$, does not belong to V; this means that $s_1$
has a nonzero component orthogonal to V. Then we can choose scalars $a_2, \ldots, a_k$
such that

    $s_2 - a_2 s_1, \ldots, s_k - a_k s_1$

belong to V. They are linearly independent, since $s_1, \ldots, s_k$ are linearly independent.
It follows that

    $\dim S - 1 \leq \dim T$,

as asserted.

We claim that B is positive on T. Take any $v \neq 0$ in T:

    (Bv, v) = (PAPv, v) = (APv, Pv) = (Av, v),

since v belongs to V. Since v also belongs to S, (Av, v) > 0.

Since $p_+(B)$ is defined as the dimension of the largest subspace on which B is
positive, and since $\dim T \geq \dim S - 1$, $p_+(B) \geq p_+(A) - 1$ follows. This completes
the proof of $(38)_+$; $(38)_-$ can be proved similarly. □

An immediate consequence of Theorem 14 is the following theorem.

Theorem 15. Let U, V, A, and B be as in Theorem 14. Denote the eigenvalues
of A as $a_1, \ldots, a_n$, and denote those of B as $b_1, \ldots, b_{n-1}$. The eigenvalues of B
separate the eigenvalues of A:

    $a_1 \leq b_1 \leq a_2 \leq \cdots \leq b_{n-1} \leq a_n$.    (39)

Proof. Apply Theorem 14 to A - c and B - c. We conclude that the number of $b_i$
less than c is not greater than the number of $a_i$ less than c, and at most one less. We
claim that $a_i \leq b_i$; if not, we could choose $b_i < c < a_i$ and obtain a contradiction.
We can show analogously that $b_i \leq a_{i+1}$. This proves that $a_i \leq b_i \leq a_{i+1}$, as asserted
in (39). □


Take U to be $\mathbb{R}^n$ with the standard Euclidean structure, and take A to be any n x n
self-adjoint matrix. Fix i to be some natural number between 1 and n, and take V to
consist of all vectors whose ith component is zero. Theorem 15 says that the
eigenvalues of the ith principal minor of A separate the eigenvalues of A.
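
The separation property can be observed on a small example. The sketch below assumes one particular 3 x 3 symmetric matrix whose eigenvalues $2 - \sqrt{2}$, 2, $2 + \sqrt{2}$ are supplied in closed form (and confirmed by evaluating the characteristic polynomial), and compares them with the eigenvalues of each 2 x 2 principal minor. All names are hypothetical.

```python
import math

def det3(T):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = T
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def eig2(S):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix S."""
    (a, b), (_, d) = S
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    return [mean - radius, mean + radius]

def principal_minor(A, i):
    """A with its ith row and column deleted (i counted from 0 here)."""
    keep = [k for k in range(len(A)) if k != i]
    return [[A[r][c] for c in keep] for r in keep]

A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
# Eigenvalues of this particular A, assumed known in closed form.
eig_A = [2.0 - math.sqrt(2.0), 2.0, 2.0 + math.sqrt(2.0)]

def char_poly(lam):
    """det(A - lam*I); vanishes at the eigenvalues of A."""
    shifted = [[A[r][c] - (lam if r == c else 0.0) for c in range(3)]
               for r in range(3)]
    return det3(shifted)

residuals = [abs(char_poly(lam)) for lam in eig_A]

tol = 1e-12
interlaced = []
for i in range(3):
    eig_B = eig2(principal_minor(A, i))
    interlaced.append(eig_A[0] <= eig_B[0] + tol and eig_B[0] <= eig_A[1] + tol
                      and eig_A[1] <= eig_B[1] + tol and eig_B[1] <= eig_A[2] + tol)
    print(f"delete row/column {i + 1}: eigenvalues {eig_B[0]:.4f}, {eig_B[1]:.4f}")
```

Deleting the first or third row and column gives eigenvalues 1 and 3; deleting the second gives the double eigenvalue 2. In each case the pair separates 0.5858, 2, 3.4142 as in (39).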

Exercise 9. Extend Theorem 14 to the case when dim V = dim U - m, where
m is greater than 1.

The following result is of fundamental interest in mathematical physics; see, for
example, Theorem 4 of Chapter 11.

Theorem 16. Let M and N denote self-adjoint k x k matrices satisfying

    M < N.    (40)

Denote the eigenvalues of M, arranged in increasing order, by $m_1 \leq \cdots \leq m_k$, and
those of N by $n_1 \leq \cdots \leq n_k$. We claim that

    $m_j < n_j$,    j = 1, ..., k.    (41)

First Proof. We appeal to the minmax principle, Theorem 10 in Chapter 8,
formula (40), according to which

    $m_j = \min_{\dim S = j} \max_{x \in S} \frac{(x, Mx)}{(x, x)}$,    $(42)_m$

    $n_j = \min_{\dim S = j} \max_{x \in S} \frac{(x, Nx)}{(x, x)}$.    $(42)_n$

Denote by T the subspace of dimension j for which the minimum in $(42)_n$ is reached,
and denote by y the vector in T where (x, Mx)/(x, x) achieves its maximum; we take
y to be normalized as $\|y\| = 1$. Then by $(42)_m$,

    $m_j \leq (y, My)$,

while from $(42)_n$,

    $(y, Ny) \leq n_j$.

Since the meaning of (40) is that (y, My) < (y, Ny) for all $y \neq 0$, (41)
follows. □

If the hypothesis (40) is weakened to $M \leq N$, the weakened conclusion $m_j \leq n_j$
can be reached by the same argument.


Second Proof. We connect M and N by a straight line:

    A(t) = M + t(N - M);    (43)

we also use calculus, as we have done so profitably in Section 1. Assuming for a
moment that the eigenvalues of A(t) are distinct, we use Theorem 7 of Chapter 9 to
conclude that the eigenvalues of A(t) depend differentiably on t, and we use formula
(24) of that chapter for the value of the derivative. Since A is self-adjoint, we can
identify in this formula the eigenvector l of $A^T$ with the eigenvector h of A itself.
Normalizing h so that $\|h\| = 1$, we have the following version of (24), Chapter 9,
for the derivative of the eigenvalue a in Ah = ah:

    $\frac{da}{dt} = \bigl(h, \frac{dA}{dt}h\bigr)$.    (43)'

For A(t) in (43), dA/dt = N - M is positive according to hypothesis (40); therefore
the right-hand side of (43)' is positive. This proves that da/dt is positive, and
therefore a(t) is an increasing function of t; in particular, a(0) < a(1). Since
A(0) = M, A(1) = N, this proves (41) in case A(t) has distinct eigenvalues for all t
in [0, 1].

In case A(t) has multiple eigenvalues for a finite set of t, the above argument
shows that each $a_j(t)$ is increasing between two such values of t; that is enough to
draw the conclusion (41). Or we can make use of the observation made at the end of
Chapter 9 that the degenerate matrices form a variety of codimension 2 and can be
avoided by changing M by a small amount and passing to the limit. □

The following result is very useful.

Theorem 17. Let $M$ and $N$ be self-adjoint $k \times k$ matrices, $m_j$ and $n_j$ their
eigenvalues arrayed in increasing order. Then

$$|n_j - m_j| \le \|M - N\|. \qquad (44)$$

Proof. Denote $\|M - N\|$ by $d$. It is easy to see that

$$N - dI \le M \le N + dI. \qquad (44)'$$

Inequality (44) follows from (44)' and (41). □

Exercise 10. Prove inequality (44)'.

Wielandt and Hoffman have proved the following interesting result.

Theorem 18. Let $M$, $N$ be self-adjoint $k \times k$ matrices and $m_j$ and $n_j$ their
eigenvalues arranged in increasing order. Then

$$\sum_j (n_j - m_j)^2 \le \|N - M\|_2^2, \qquad (45)$$

where $\|N - M\|_2$ is the Hilbert-Schmidt norm defined by

$$\|C\|_2^2 = \sum_{i,j} |c_{ij}|^2. \qquad (46)$$

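Proof. The Hilbert-Schmidt norm of any matrix can be expressed as a trace:

$$\|C\|_2^2 = \operatorname{tr} C^*C. \qquad (46)'$$

For $C$ self-adjoint,

$$\|C\|_2^2 = \operatorname{tr} C^2. \qquad (46)''$$

Using (46)'' we can rewrite inequality (45) as

$$\sum_j (n_j - m_j)^2 \le \operatorname{tr}(N - M)^2.$$

Expanding both sides and using the linearity and commutativity of trace gives

$$\sum_j n_j^2 - 2 \sum_j n_j m_j + \sum_j m_j^2 \le \operatorname{tr} N^2 - 2 \operatorname{tr}(NM) + \operatorname{tr} M^2. \qquad (47)$$

According to Theorem 3 of Chapter 6, the trace of $N^2$ is the sum of the eigenvalues
of $N^2$. According to the spectral mapping theorem, the eigenvalues of $N^2$ are $n_j^2$.
Therefore

$$\sum_j n_j^2 = \operatorname{tr} N^2, \qquad \sum_j m_j^2 = \operatorname{tr} M^2,$$

so inequality (47) can be restated as

$$\sum_j n_j m_j \ge \operatorname{tr}(NM). \qquad (47)'$$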



To prove this we fix $M$ and consider all self-adjoint matrices $N$ whose eigenvalues
are $n_1, \ldots, n_k$. The set of such matrices $N$ forms a closed, bounded set in the space of all
self-adjoint matrices. By compactness, there is among these that matrix $N_{\max}$ that
renders the right-hand side of (47)' largest. According to calculus, the maximizing
matrix $N_{\max}$ has the following property: if $N(t)$ is a differentiable function whose
values are self-adjoint matrices with eigenvalues $n_1, \ldots, n_k$, and $N(0) = N_{\max}$, then

$$\left. \frac{d}{dt} \operatorname{tr}(N(t)M) \right|_{t=0} = 0. \qquad (48)$$


Let $A$ denote any anti-self-adjoint matrix; according to Theorem 5, part (e),
Chapter 9, $e^{At}$ is unitary for any real value of $t$. Now define

$$N(t) = e^{At} N_{\max} e^{-At}. \qquad (49)$$

Clearly, $N(t)$ is self-adjoint and has the same eigenvalues as $N_{\max}$. According to part
(d) of Theorem 5, Chapter 9,

$$\frac{d}{dt} e^{At} = A e^{At} = e^{At} A.$$

Using the rules of differentiation developed in Chapter 9, we get, upon
differentiating (49), that

$$\frac{d}{dt} N(t) = e^{At} (A N_{\max} - N_{\max} A) e^{-At}.$$

Setting this into (48) gives at $t = 0$

$$\left. \frac{d}{dt} \operatorname{tr}(N(t)M) \right|_{t=0} = \operatorname{tr}(A N_{\max} M - N_{\max} A M) = 0.$$


Using the commutativity of trace, we can rewrite this as

$$\operatorname{tr}(A(N_{\max} M - M N_{\max})) = 0. \qquad (48)'$$

The commutator of two self-adjoint matrices $N_{\max}$ and $M$ is anti-self-adjoint, so we
may choose

$$A = N_{\max} M - M N_{\max}. \qquad (50)$$

Setting this into (48)' reveals that $\operatorname{tr} A^2 = 0$; since by (46)', for anti-self-adjoint $A$,

$$\operatorname{tr} A^2 = -\operatorname{tr} A^*A = -\|A\|_2^2,$$

we deduce that $A = 0$, so according to (50) the matrices $N_{\max}$ and $M$ commute.
Such matrices can be diagonalized simultaneously; the diagonal entries are $n_j$ and
$m_j$, in some order. The trace of $N_{\max} M$ can therefore be computed in this
representation as

$$\sum_j n_{p_j} m_j, \qquad (51)$$

where $p_j$, $j = 1, \ldots, k$ is some permutation of $1, \ldots, k$. It is not hard to show, and is
left as an exercise to the reader, that the sum (51) is largest when the $n_j$ are arranged
in the same order as the $m_j$, that is, increasingly. This proves inequality (47)' for $N_{\max}$
and hence for all $N$. □

Exercise 11. Show that (51) is largest when $n_j$ and $m_j$ are arranged in the same
order.
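
The inequalities (44) and (45), and the claim about the sum (51), can be checked numerically on small examples. The sketch below is only an illustration under stated assumptions: the matrices, the sample sequences, and the helper `sym_eig2` are made up for this example, and the operator norm of a symmetric $2 \times 2$ matrix is taken to be its largest eigenvalue in absolute value.

```python
import itertools
import math

def sym_eig2(a, b, d):
    """Eigenvalues, in increasing order, of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return (mean - radius, mean + radius)

# Hypothetical example matrices, stored as (a, b, d) for [[a, b], [b, d]].
M = (2.0, -1.0, 0.5)
N = (1.0, 0.5, 3.0)
m_eigs, n_eigs = sym_eig2(*M), sym_eig2(*N)
diff = tuple(n - m for m, n in zip(M, N))      # entries of N - M

# (44): each |n_j - m_j| is at most the operator norm of M - N.
op_norm = max(abs(e) for e in sym_eig2(*diff))
thm17_holds = all(abs(nj - mj) <= op_norm + 1e-12 for mj, nj in zip(m_eigs, n_eigs))

# (45): the sum of squared differences is at most the squared Hilbert-Schmidt norm.
lhs = sum((nj - mj) ** 2 for mj, nj in zip(m_eigs, n_eigs))
rhs = diff[0] ** 2 + 2.0 * diff[1] ** 2 + diff[2] ** 2
thm18_holds = lhs <= rhs + 1e-12

# (51): among all pairings, matching increasing with increasing gives the largest sum.
m_sorted = [-1.0, 0.5, 2.0, 4.0]
n_sorted = [-3.0, 0.0, 1.0, 5.0]
sums = {p: sum(n_sorted[pj] * mj for pj, mj in zip(p, m_sorted))
        for p in itertools.permutations(range(4))}
best = max(sums, key=sums.get)
print(thm17_holds, thm18_holds, best)
```

A brute-force search over permutations is of course no proof of Exercise 11; it merely shows the identity permutation winning for one choice of data.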

The next result is useful in many problems of physics.

Theorem 19. Denote by $e_{\min}(H)$ the smallest eigenvalue of a self-adjoint
mapping $H$ in a Euclidean space. We claim that $e_{\min}$ is a concave function of $H$, that
is, that for $0 < t < 1$,

$$e_{\min}(tL + (1 - t)M) \ge t\, e_{\min}(L) + (1 - t)\, e_{\min}(M) \qquad (52)$$

for any pair of self-adjoint maps $L$ and $M$. Similarly, $e_{\max}(H)$ is a convex function of
$H$; for $0 < t < 1$,

$$e_{\max}(tL + (1 - t)M) \le t\, e_{\max}(L) + (1 - t)\, e_{\max}(M). \qquad (52)'$$


Proof. We have shown in Chapter 8, equation (37), that the smallest eigenvalue
of a mapping can be characterized as a minimum:

$$e_{\min}(H) = \min_{\|x\| = 1} (x, Hx). \qquad (53)$$

Let $y$ be a unit vector where $(x, Hx)$, with $H = tL + (1 - t)M$, reaches its minimum.
Then

$$e_{\min}(tL + (1 - t)M) = t(y, Ly) + (1 - t)(y, My)$$

$$\ge t \min_{\|x\| = 1} (x, Lx) + (1 - t) \min_{\|x\| = 1} (x, Mx)$$

$$= t\, e_{\min}(L) + (1 - t)\, e_{\min}(M).$$

This proves (52). Since $-e_{\max}(A) = e_{\min}(-A)$, the convexity of $e_{\max}(A)$
follows. □

Note that the main thrust of the argument above is that any function characterized 
as the minimum of a collection of linear functions is concave. 
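
The concavity (52) and convexity (52)' can be observed numerically. The following sketch samples $t$ on a grid for two symmetric $2 \times 2$ matrices; the matrices and the helper functions are assumptions of this example only.

```python
import math

def sym_eig2(a, b, d):
    """Eigenvalues, in increasing order, of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return (mean - radius, mean + radius)

def e_min(mat):
    return sym_eig2(*mat)[0]

def e_max(mat):
    return sym_eig2(*mat)[1]

# Hypothetical self-adjoint maps, stored as (a, b, d) for [[a, b], [b, d]].
L = (1.0, 2.0, -1.0)
M = (0.0, -1.0, 3.0)

concave = True
convex = True
for step in range(11):
    t = step / 10.0
    H = tuple(t * l + (1.0 - t) * m for l, m in zip(L, M))
    # (52): e_min of the mixture lies above the mixture of the e_min values.
    concave = concave and e_min(H) >= t * e_min(L) + (1.0 - t) * e_min(M) - 1e-12
    # (52)': e_max of the mixture lies below the mixture of the e_max values.
    convex = convex and e_max(H) <= t * e_max(L) + (1.0 - t) * e_max(M) + 1e-12
print(concave, convex)
```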
4. REPRESENTATION OF ARBITRARY MAPPINGS

Every linear mapping $Z$ of a complex Euclidean space into itself can be
decomposed, uniquely, as a sum of a self-adjoint mapping and an anti-self-adjoint
one:

$$Z = H + A, \qquad (54)$$

where

$$H^* = H, \quad A^* = -A. \qquad (54)'$$

Clearly, if (54) and (54)' hold, $Z^* = H^* + A^* = H - A$, so $H$ and $A$ are given by

$$H = \frac{Z + Z^*}{2}, \qquad A = \frac{Z - Z^*}{2}.$$

$H$ is called the self-adjoint part of $Z$, $A$ the anti-self-adjoint part.

Theorem 20. Suppose the self-adjoint part of $Z$ is positive:

$$Z + Z^* > 0.$$

Then the eigenvalues of $Z$ have positive real part.

Proof. Using the conjugate symmetry of scalar product in a complex Euclidean
space, and the definition of adjoint, we have the following identity for any vector $h$:

$$2 \operatorname{Re}(Zh, h) = (Zh, h) + \overline{(Zh, h)} = (Zh, h) + (h, Zh) = (Zh, h) + (Z^*h, h) = ((Z + Z^*)h, h).$$

Since we assumed in Theorem 20 that $Z + Z^*$ is positive, we conclude that for any
vector $h \ne 0$, $(Zh, h)$ has positive real part.

Let $h$ be an eigenvector for $Z$ of norm $\|h\| = 1$, with $z$ the corresponding
eigenvalue, $Zh = zh$. Then $(Zh, h) = z$ has positive real part. □

In Appendix 14 we give a far-reaching extension of Theorem 20. 
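
Theorem 20 can be illustrated on a single $2 \times 2$ complex matrix. In the sketch below the matrix entries are made-up values, the eigenvalues come from the quadratic formula for the characteristic polynomial, and positivity of the Hermitian matrix $Z + Z^*$ is tested by the standard $2 \times 2$ criterion (positive leading entry and positive determinant). None of this is taken from the text; it is an assumption-laden check, not a proof.

```python
import cmath

def eig2(z11, z12, z21, z22):
    """Eigenvalues of a general complex 2 x 2 matrix via the quadratic formula."""
    tr = z11 + z22
    det = z11 * z22 - z12 * z21
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr - root) / 2.0, (tr + root) / 2.0)

# Hypothetical example matrix Z (not self-adjoint).
z11, z12, z21, z22 = 1.0 + 2.0j, 0.5 - 1.0j, -0.3 + 0.2j, 2.0 - 1.0j

# Entries of the self-adjoint matrix Z + Z* = [[p, q], [conj(q), r]].
p = 2.0 * z11.real
r = 2.0 * z22.real
q = z12 + z21.conjugate()
self_adjoint_part_positive = p > 0 and p * r - abs(q) ** 2 > 0

eigs = eig2(z11, z12, z21, z22)
real_parts_positive = all(e.real > 0 for e in eigs)
print(self_adjoint_part_positive, eigs, real_parts_positive)
```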

Theorem 20 can be used to give another proof of Theorem 4 about symmetrized
products: Let $A$ and $B$ be self-adjoint maps, and assume that $A$ and $AB + BA = S$
are positive. We claim that then $B$ is positive.

Second Proof of Theorem 4. Since $A$ is positive, it has according to Theorem 1 a
square root $A^{1/2}$ that is invertible. We multiply the relation

$$AB + BA = S$$

by $A^{-1/2}$ from the right and the left:

$$A^{1/2} B A^{-1/2} + A^{-1/2} B A^{1/2} = A^{-1/2} S A^{-1/2}. \qquad (55)$$

We introduce the abbreviation

$$A^{1/2} B A^{-1/2} = Z \qquad (56)$$

and rewrite (55) as

$$Z + Z^* = A^{-1/2} S A^{-1/2}. \qquad (55)'$$

Since $S$ is positive, so, according to Theorem 1, is $A^{-1/2} S A^{-1/2}$; it follows from (55)'
that $Z + Z^*$ is positive. By Theorem 20 the eigenvalues of $Z$ have positive real part.
Formula (56) shows that $Z$ and $B$ are similar; therefore they have the same
eigenvalues. Since $B$ is self-adjoint, it has real eigenvalues; so we conclude that the
eigenvalues of $B$ are positive. This, according to Theorem 1, guarantees that $B$ is
positive. □

Exercise 12. Prove that if the self-adjoint part of $Z$ is positive, then $Z$ is
invertible, and the self-adjoint part of $Z^{-1}$ is positive.

The decomposition of an arbitrary $Z$ as a sum of its self-adjoint and anti-self-adjoint
parts is analogous to writing a complex number as the sum of its real and imaginary
parts, and the norm is analogous to the absolute value. The next result strengthens this
analogy. Let $a$ denote any complex number with positive real part; then

$$z \to \frac{1 - az}{1 + \bar{a} z} = w$$

maps the right half-plane $\operatorname{Re} z > 0$ onto the unit disc $|w| < 1$. Analogously, we claim
the following:

Theorem 21. Let $a$ be a complex number with $\operatorname{Re} a > 0$. Let $Z$ be a mapping
whose self-adjoint part $Z + Z^*$ is positive. Then

$$W = (I - aZ)(I + \bar{a} Z)^{-1} \qquad (57)$$

is a mapping of norm less than 1. Conversely, $\|W\| < 1$ implies that $Z + Z^* > 0$.

Proof. According to Theorem 20 the eigenvalues $z$ of $Z$ have positive real part. It
follows that the eigenvalues of $I + \bar{a} Z$, namely $1 + \bar{a} z$, are $\ne 0$; therefore $I + \bar{a} Z$ is invertible.

For any vector $x$, denote $(I + \bar{a} Z)^{-1} x = y$; then by (57),

$$(I - aZ) y = Wx,$$

and by definition of $y$,

$$(I + \bar{a} Z) y = x.$$

The condition $\|W\| < 1$ means that $\|Wx\|^2 < \|x\|^2$ for all $x \ne 0$; in terms of $y$
this can be expressed as

$$\|y - aZy\|^2 < \|y + \bar{a} Zy\|^2. \qquad (58)$$

Expanding both sides gives

$$\|y\|^2 + |a|^2 \|Zy\|^2 - a(Zy, y) - \bar{a}(y, Zy) < \|y\|^2 + |a|^2 \|Zy\|^2 + \bar{a}(Zy, y) + a(y, Zy). \qquad (59)$$

Cancelling identical terms and rearranging gives

$$0 < (a + \bar{a})[(Zy, y) + (y, Zy)] = 2 \operatorname{Re} a \, ([Z + Z^*]y, y). \qquad (60)$$

Since we have assumed that $\operatorname{Re} a$ is positive and that $Z + Z^* > 0$, (60) is true.
Conversely, if (60) holds for all $y$, $Z + Z^*$ is positive. □
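
The scalar analogue of Theorem 21 is easy to test numerically. The sketch below, with a made-up value of $a$ and a made-up sampling grid, checks that the map $z \to (1 - az)/(1 + \bar{a} z)$ sends sample points of the right half-plane into the unit disc and sample points of the imaginary axis onto the unit circle.

```python
def cayley(a, z):
    """Scalar analogue of (57): w = (1 - a z) / (1 + conj(a) z)."""
    return (1.0 - a * z) / (1.0 + a.conjugate() * z)

a = 0.7 + 1.3j                   # hypothetical choice with Re a > 0

inside = True
for i in range(1, 21):           # Re z = 0.25, 0.5, ..., 5.0
    for k in range(-20, 21):     # Im z = -5.0, ..., 5.0
        z = complex(0.25 * i, 0.25 * k)
        inside = inside and abs(cayley(a, z)) < 1.0

# Points on the imaginary axis should land on the unit circle.
on_circle = all(abs(abs(cayley(a, complex(0.0, 0.5 * k))) - 1.0) < 1e-12
                for k in range(-10, 11))
print(inside, on_circle)
```

The cancellation seen on the imaginary axis is the scalar form of the step from (59) to (60): the two sides differ exactly by $2 \operatorname{Re} a$ times $2 \operatorname{Re} z$.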

Complex numbers $z$ have not only additive but multiplicative decompositions:
$z = re^{i\theta}$, $r > 0$, $|e^{i\theta}| = 1$. Mappings of Euclidean spaces have similar
decompositions.

Theorem 22. Let $A$ be a linear mapping of a complex Euclidean space into
itself. Then $A$ can be factored as

$$A = RU, \qquad (61)$$

where $R$ is a nonnegative self-adjoint mapping, and $U$ is unitary. When $A$ is
invertible, $R$ is positive.

Proof. Take first the case that $A$ is invertible; then so is $A^*$. For any $x \ne 0$,

$$(AA^*x, x) = (A^*x, A^*x) = \|A^*x\|^2 > 0.$$

This proves that $AA^*$ is a positive mapping. According to Theorem 1, $AA^*$ has a
unique positive square root $R$:

$$AA^* = R^2. \qquad (62)$$

Define $U$ as $R^{-1}A$; then $U^* = A^*R^{-1}$, and so by (62),

$$UU^* = R^{-1}AA^*R^{-1} = R^{-1}R^2R^{-1} = I.$$

It follows that $U$ is unitary. By definition of $U$ as $R^{-1}A$,

$$A = RU,$$

as asserted in (61).

When $A$ is not invertible, $AA^*$ is a nonnegative self-adjoint map; it has a uniquely
determined nonnegative square root $R$. Therefore

$$\|Rx\|^2 = (Rx, Rx) = (R^2x, x) = (AA^*x, x) = (A^*x, A^*x) = \|A^*x\|^2. \qquad (63)$$

Suppose $Rx = Ry$; then $\|R(x - y)\| = 0$, and so according to (63),
$\|A^*(x - y)\| = 0$, therefore $A^*x = A^*y$. This shows that for any $u$ in the range of
$R$, $u = Rx$, we can define $Vu$ as $A^*x$. According to (63), $V$ is an isometry; therefore it
can be extended to the whole space as a unitary mapping.

By definition, $A^* = VR$; taking its adjoint gives $A = RV^*$, which is relation (61)
with $V^* = U$. □
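
The construction in the invertible case can be followed step by step on a real $2 \times 2$ matrix. The sketch below is an illustration under assumptions: the matrix $A$ is made up, and the square root of $AA^*$ is obtained from a closed form special to positive $2 \times 2$ matrices (a consequence of the Cayley-Hamilton theorem), which is a substitute for the spectral construction of Theorem 1.

```python
import math

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose2(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inverse2(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]

def sqrt_pos2(P):
    """Positive square root of a positive symmetric 2 x 2 matrix P.

    Closed form: sqrt(P) = (P + s I) / t, with s = sqrt(det P), t = sqrt(tr P + 2 s).
    """
    s = math.sqrt(P[0][0] * P[1][1] - P[0][1] * P[1][0])
    t = math.sqrt(P[0][0] + P[1][1] + 2.0 * s)
    return [[(P[0][0] + s) / t, P[0][1] / t], [P[1][0] / t, (P[1][1] + s) / t]]

A = [[2.0, 1.0], [-1.0, 3.0]]               # hypothetical invertible example
R = sqrt_pos2(matmul2(A, transpose2(A)))    # R^2 = A A*, as in (62)
U = matmul2(inverse2(R), A)                 # U = R^{-1} A

UUt = matmul2(U, transpose2(U))             # should be the identity
RU = matmul2(R, U)                          # should reproduce A, as in (61)
print(R, U)
```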

According to the spectral representation theorem, the self-adjoint map $R$ can be
expressed as $R = WDW^*$, where $D$ is diagonal and $W$ is unitary. Setting this into
(61) gives $A = WDW^*U$. Denoting $W^*U$ as $V$, we get

$$A = WDV, \qquad (64)$$

where $W$ and $V$ are unitary and $D$ is diagonal, with nonnegative entries. Equation
(64) is called the singular value decomposition of the mapping $A$. The diagonal
entries of $D$ are called the singular values of $A$; they are the nonnegative square roots
of the eigenvalues of $AA^*$.

Take the adjoint of both sides of (61); we get

$$A^* = U^*R. \qquad (61)^*$$

Denote $A^*$ as $B$, denote $U^*$ as $M$ and $R$ as $S$, and restate (61)$^*$ as

Theorem 22*. Every linear mapping $B$ of a complex Euclidean space can be
factored as

$$B = MS,$$

where $S$ is self-adjoint and nonnegative, and $M$ is unitary.

Note. When $B$ maps a real Euclidean space into itself, so do $S$ and $M$.

Exercise 13. Let $A$ be any mapping of a Euclidean space into itself. Show that
$AA^*$ and $A^*A$ have the same eigenvalues with the same multiplicity.

Exercise 14. Let $A$ be a mapping of a Euclidean space into another Euclidean
space. Show that $AA^*$ and $A^*A$ have the same nonzero eigenvalues with the same
multiplicity.

Exercise 15. Give an example of a $2 \times 2$ matrix $Z$ whose eigenvalues have
positive real part but $Z + Z^*$ is not positive.

Exercise 16. Verify that the commutator (50) of two self-adjoint matrices is
anti-self-adjoint.



CHAPTER 11 


Kinematics and Dynamics 


In this chapter we shall illustrate how extremely useful the theory of linear algebra in 
general and matrices in particular are for describing motion in space. There are three 
sections, on the kinematics of rigid body motions, on the kinematics of fluid flow, 
and on the dynamics of small vibrations. 


1. THE MOTION OF RIGID BODIES 

An isometry was defined in Chapter 7 as a mapping of a Euclidean space into itself
that preserves distances. When the isometry relates the positions of a mechanical
system in three-dimensional real space at two different times, it is called a rigid body
motion. In this section we shall study such motions.

Theorem 10 of Chapter 7 shows that an isometry $M$ that preserves the origin is
linear and satisfies

$$M^*M = I. \qquad (1)$$

As noted in equation (33) of that chapter, the determinant of such an isometry is plus
or minus 1; its value for all rigid body motions is 1.


Theorem 1 (Euler). An isometry $M$ of three-dimensional real Euclidean space
with determinant plus 1 that is nontrivial, that is, not equal to $I$, is a rotation; it has a
uniquely defined axis of rotation and angle of rotation $\theta$.

Proof. Points $f$ on the axis of rotation remain fixed, so they satisfy

$$Mf = f; \qquad (2)$$

that is, they are eigenvectors of $M$ with eigenvalue 1. We claim that a nontrivial
isometry, $\det M = 1$, has exactly one eigenvalue equal to 1. To see this, look at the
characteristic polynomial of $M$, $p(s) = \det(sI - M)$. Since $M$ is a real matrix, $p(s)$
has real coefficients. The leading term in $p(s)$ is $s^3$, so $p(s)$ tends to $+\infty$ as $s$ tends
to $+\infty$. On the other hand, $p(0) = \det(-M) = -\det M = -1$. So $p$ has a root on
the positive axis; that root is an eigenvalue of $M$. Since $M$ is an isometry, that
eigenvalue can only be plus 1. Furthermore, 1 is a simple eigenvalue; for if a second
eigenvalue were equal to 1, then, since the product of all three eigenvalues equals
$\det M = 1$, the third eigenvalue of $M$ would also be 1. Since $M$ is a normal matrix, it
has a full set of eigenvectors, all with eigenvalue 1; that would make $M = I$,
excluded as the trivial case.

To see that $M$ is a rotation around the axis formed by the fixed vectors, we
represent $M$ in an orthonormal basis consisting of $f$ satisfying (2), and two other
vectors. In this basis the column vector $(1, 0, 0)$ is an eigenvector of $M$ with
eigenvalue 1; so the first column is $(1, 0, 0)$. Since the columns of an isometry are
orthogonal unit vectors and $\det M = 1$, the matrix $M$ has the form

$$M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c & -s \\ 0 & s & c \end{pmatrix}, \qquad (3)$$

where $c^2 + s^2 = 1$. Thus $c = \cos\theta$, $s = \sin\theta$, $\theta$ some angle. Clearly, (3) is rotation
around the first axis by angle $\theta$. □


The rotation angle is easily calculated without introducing a new basis that brings
$M$ into form (3). We recall the definition of trace from Chapter 6 and Theorem 2 in
that chapter, according to which similar matrices have the same trace. Therefore, $M$
has the same trace in every basis; from (3),

$$\operatorname{tr} M = 1 + 2\cos\theta, \qquad (4)$$

hence

$$\cos\theta = \frac{\operatorname{tr} M - 1}{2}. \qquad (4)'$$
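
As an illustration of the trace formula for the angle, the sketch below builds a rotation about an arbitrary axis and then recovers its angle from the trace alone. The construction of the matrix uses Rodrigues' rotation formula, which is not part of the text and is brought in here only to manufacture a test matrix; the axis and angle are made-up values.

```python
import math

def rotation(axis, theta):
    """Rotation matrix about the unit vector `axis` by angle theta (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(theta), math.sin(theta)
    C = 1.0 - c
    return [[c + x * x * C, x * y * C - z * s, x * z * C + y * s],
            [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
            [z * x * C - y * s, z * y * C + x * s, c + z * z * C]]

axis = (2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0)   # hypothetical unit vector
theta = 0.9                                  # hypothetical angle in radians
M = rotation(axis, theta)

# The angle is recovered from the trace: cos(theta) = (tr M - 1) / 2.
trace = M[0][0] + M[1][1] + M[2][2]
recovered = math.acos((trace - 1.0) / 2.0)

# The axis is fixed by M, as in equation (2): M f = f.
Mf = [sum(M[i][j] * axis[j] for j in range(3)) for i in range(3)]
print(recovered, Mf)
```

Note that the trace determines only $\cos\theta$, so the recovered angle lies in $[0, \pi]$; the sense of rotation has to be read off from the axis.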


We turn now to rigid motions which keep the origin fixed and which depend on
time $t$, that is, functions $M(t)$ whose values are rotations. We take $M(t)$ to be the
rotation that brings the configuration at time 0 into the configuration at time $t$. Thus

$$M(0) = I. \qquad (5)$$

If we change the reference time from 0 to $t_1$, the function $M_1$ describing the motion
from $t_1$ to $t$ is

$$M_1(t) = M(t) M^{-1}(t_1). \qquad (6)$$

Equation (1) shows that $M^*$ is left inverse of $M$; then it is also right inverse:

$$MM^* = I. \qquad (7)$$

We assume that $M(t)$ is a differentiable function of $t$. Differentiating this with respect
to $t$ and denoting the derivative by the subscript $t$ gives

$$M_t M^* + M M_t^* = 0. \qquad (8)$$

We denote

$$M_t M^* = A. \qquad (9)$$

Since differentiation and taking the adjoint commute,

$$A^* = M M_t^*;$$

therefore (8) can be written as

$$A + A^* = 0. \qquad (10)$$

This shows that $A(t)$ is antisymmetric. Equation (9) itself can be rewritten by
multiplying by $M$ on the right and using (1):

$$M_t = AM. \qquad (11)$$

Note that if we differentiate (6) and use (11) we get the same equation

$$(M_1)_t = A M_1. \qquad (11)_1$$

This shows that the significance of $A(t)$ for the motion is independent of the reference
time; $A(t)$ is called the infinitesimal generator of the motion.

Exercise 1. Show that if $M(t)$ satisfies a differential equation of form (11),
where $A(t)$ is antisymmetric for each $t$, and the initial condition (5), then $M(t)$ is a
rotation for every $t$.

Exercise 2. Suppose that $A$ is independent of $t$; show that the solution of
equation (11) satisfying the initial condition (5) is

$$M(t) = e^{tA}. \qquad (12)$$

Exercise 3. Show that when $A$ depends on $t$, equation (11) is not solved by

$$M(t) = \exp\left(\int_0^t A(s)\, ds\right),$$

unless $A(t)$ and $A(s)$ commute for all $s$ and $t$.

We investigate now $M(t)$ near $t = 0$; we assume that $M(t) \ne I$ for $t \ne 0$; then for
each $t \ne 0$, $M(t)$ has a unique axis of rotation $f(t)$:

$$M(t) f(t) = f(t).$$

We assume that $f(t)$ depends differentiably on $t$; differentiating the preceding
formula gives

$$M_t f + M f_t = f_t.$$

We assume that both $f(t)$ and $f_t(t)$ have limits as $t \to 0$. Letting $t \to 0$ in this formula
gives

$$M_t(0) f(0) + M(0) f_t(0) = f_t(0). \qquad (13)$$

Using (11) and (5), we get

$$A(0) f(0) = 0. \qquad (14)$$

We claim that if $A(0) \ne 0$ then this equation has essentially one solution, that is, all
solutions are multiples of each other. To see that there is a nontrivial solution, recall that $A$
is antisymmetric; for $n$ odd,

$$\det A = \det A^T = \det(-A) = (-1)^n \det A = -\det A,$$

from which it follows that $\det A = 0$, that is, the determinant of an antisymmetric
matrix of odd order is zero. This proves that $A$ is not invertible, so that (14) has a
nontrivial solution. This fact can also be seen directly for $3 \times 3$ matrices by writing
out

$$A = \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix}. \qquad (15)$$

Inspection shows that

$$f = \begin{pmatrix} -c \\ b \\ -a \end{pmatrix} \qquad (16)$$

lies in the nullspace of $A$.

Exercise 4. Show that if $A$ in (15) is not equal to 0, then all vectors annihilated
by $A$ are multiples of (16).


Exercise 5. Show that the two other eigenvalues of $A$ are $\pm i\sqrt{a^2 + b^2 + c^2}$.

Exercise 6. Show that the motion $M(t)$ described by (12) is rotation around the
axis through the vector $f$ given by formula (16). Show that the angle of rotation is
$t\sqrt{a^2 + b^2 + c^2}$. (Hint: Use formula (4)'.)

The one-dimensional subspace spanned by $f(0)$ satisfying (14), being the limit of
the axes of rotation $f(t)$, is called the instantaneous axis of rotation of the motion at
$t = 0$.

Let $\theta(t)$ denote the angle through which $M(t)$ rotates. Formula (4)' shows that $\theta(t)$
is a differentiable function of $t$; since $M(0) = I$, it follows that $\operatorname{tr} M(0) = 3$, and so
by (4)' $\cos\theta(0) = 1$. This shows that $\theta(0) = 0$.

We determine now the derivative of $\theta$ at $t = 0$. For this purpose we differentiate
(4)' twice with respect to $t$. Since trace is a linear function of matrices, the derivative
of the trace is the trace of the derivative, and so we get

$$-\theta_{tt} \sin\theta - \theta_t^2 \cos\theta = \tfrac{1}{2} \operatorname{tr} M_{tt}.$$

Setting $t = 0$ gives

$$\theta_t^2(0) = -\tfrac{1}{2} \operatorname{tr} M_{tt}(0). \qquad (17)$$

To express $M_{tt}(0)$ we differentiate (11):

$$M_{tt} = A_t M + A M_t = A_t M + A^2 M.$$

Setting $t = 0$ gives

$$M_{tt}(0) = A_t(0) + A^2(0).$$

Take the trace of both sides. Since $A(t)$ is antisymmetric for every $t$, so is $A_t$; the trace
of an antisymmetric matrix being zero, we get $\operatorname{tr} M_{tt}(0) = \operatorname{tr} A^2(0)$. Using formula
(15), a brief calculation gives

$$\operatorname{tr} A^2(0) = -2(a^2 + b^2 + c^2).$$

Combining the last two relations and setting it into (17) gives

$$\theta_t^2(0) = a^2 + b^2 + c^2.$$

Compare this with (16); we get

$$|\theta_t| = \|f\|. \qquad (18)$$


The quantity $\theta_t$ is called the instantaneous angular velocity of the motion; the vector
$f$ given by (16) is called the instantaneous angular velocity vector.

Exercise 7. Show that the commutator

$$[A, B] = AB - BA$$

of two antisymmetric matrices is antisymmetric.

Exercise 8. Let $A$ denote the $3 \times 3$ matrix (15); we denote the associated null
vector (16) by $f_A$. Obviously, $f_A$ depends linearly on $A$.

(a) Let $A$ and $B$ denote two $3 \times 3$ antisymmetric matrices. Show that

$$\operatorname{tr} AB = -2(f_A, f_B),$$

where $(\,,)$ denotes the standard scalar product for vectors in $\mathbb{R}^3$.

Exercise 9. Show that the cross product can be expressed as

$$f_{[A,B]} = f_A \times f_B.$$

2. THE KINEMATICS OF FLUID FLOW 

The concept of angular velocity vector is also useful for discussing motions that are
not rigid, such as the motion of fluids. We describe the motion of a fluid by

$$x = x(y, t); \qquad (19)$$

here $x$ denotes the position of a point in the fluid at time $t$ that at time zero was
located at $y$:

$$x(y, 0) = y. \qquad (19)_0$$

The partial derivative of $x$ with respect to $t$, $y$ fixed, is the velocity $v$ of the flow:

$$\frac{\partial}{\partial t} x(y, t) = x_t(y, t) = v(y, t). \qquad (20)$$

The mapping $y \to x$, $t$ fixed, is described locally by the Jacobian matrix

$$J(y, t) = \frac{\partial x}{\partial y}, \quad \text{that is,} \quad J_{ij} = \frac{\partial x_i}{\partial y_j}. \qquad (21)$$

It follows from (19)$_0$ that

$$J(y, 0) = I. \qquad (21)_0$$

We learn in the integral calculus of functions of several variables that the
determinant of the Jacobian $J(y, t)$ is the factor by which the volume of the fluid
initially at $y$ is expanded at time $t$. We assume that the fluid is never compressed to
zero. Since at $t = 0$, $\det J(y, 0) = \det I = 1$ is positive, it follows that $\det J(y, t)$ is
positive for all $t$.

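We appeal now to Theorem 22* of Chapter 10 to factor the matrix $J$ as

$$J = MS, \qquad (22)$$

$M = M(y, t)$ a rotation, $S = S(y, t)$ self-adjoint and positive. Since $J$ is real, so are $M$
and $S$. Since $\det J$ and $\det S$ are positive, so is $\det M$. Since $J(t) \to I$ as $t \to 0$, it
follows, see the proof of Theorem 22 in Chapter 10, that also $S$ and $M \to I$ as $t \to 0$.

It follows from the spectral theory of self-adjoint matrices that $S$ acts as
compression or dilation along the three axes that are the eigenvectors of $S$. $M$ is
rotation; we shall calculate now the rate of rotation by the action of $M$. To do this we
differentiate (22) with respect to $t$:

$$J_t = M S_t + M_t S. \qquad (22)'$$

We multiply (22) by $M^*$ on the left; since $M^*M = I$ we get

$$M^*J = S.$$

We multiply this relation by $M_t$ from the left, make use of the differential equation
$M_t = AM$, see (11), and that $MM^* = I$:

$$M_t S = A M M^* J = AJ.$$

Setting this into (22)' gives

$$J_t = M S_t + AJ. \qquad (23)$$

Set $t = 0$:

$$J_t(0) = S_t(0) + A(0). \qquad (23)_0$$

We recall from (10) that $A(0)$ is anti-self-adjoint. $S_t$, on the other hand, being the
derivative of self-adjoint matrices, is itself self-adjoint. Thus (23)$_0$ is the
decomposition of $J_t(0)$ into its self-adjoint and anti-self-adjoint parts.

Differentiating (21) with respect to $t$ and using (20) gives

$$J_t = \frac{\partial v}{\partial y}, \qquad (24)$$

that is,

$$(J_t)_{ij} = \frac{\partial v_i}{\partial y_j}. \qquad (24)'$$

Thus the self-adjoint and anti-self-adjoint parts of $J_t(0)$ are

$$S_t(0)_{ij} = \frac{1}{2}\left(\frac{\partial v_i}{\partial y_j} + \frac{\partial v_j}{\partial y_i}\right), \qquad (25)$$

$$A(0)_{ij} = \frac{1}{2}\left(\frac{\partial v_i}{\partial y_j} - \frac{\partial v_j}{\partial y_i}\right). \qquad (25)'$$

In (15) we have given the names $a$, $b$, $c$ to the entries of $A$:

$$a = \frac{1}{2}\left(\frac{\partial v_1}{\partial y_2} - \frac{\partial v_2}{\partial y_1}\right), \quad b = \frac{1}{2}\left(\frac{\partial v_1}{\partial y_3} - \frac{\partial v_3}{\partial y_1}\right), \quad c = \frac{1}{2}\left(\frac{\partial v_2}{\partial y_3} - \frac{\partial v_3}{\partial y_2}\right).$$

Set this into formula (16) for the instantaneous angular velocity vector:

$$f = \begin{pmatrix} -c \\ b \\ -a \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \dfrac{\partial v_3}{\partial y_2} - \dfrac{\partial v_2}{\partial y_3} \\ \dfrac{\partial v_1}{\partial y_3} - \dfrac{\partial v_3}{\partial y_1} \\ \dfrac{\partial v_2}{\partial y_1} - \dfrac{\partial v_1}{\partial y_2} \end{pmatrix} = \frac{1}{2} \operatorname{curl} v. \qquad (26)$$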

In words: A fluid that is flowing with velocity $v$ has instantaneous angular velocity
equal to $\tfrac{1}{2} \operatorname{curl} v$, called its vorticity. A flow for which $\operatorname{curl} v = 0$ is called
irrotational.
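
For a linear velocity field $v(y) = Gy$ the relation between the angular velocity vector (16) and the curl can be verified directly. In the sketch below the matrix $G$, whose entries play the role of $\partial v_i / \partial y_j$, is a made-up example.

```python
# Hypothetical velocity gradient G[i][j] = dv_i / dy_j of a linear flow v(y) = G y.
G = [[0.1, 2.0, -0.5],
     [-1.0, 0.3, 0.7],
     [0.4, 1.5, -0.4]]

# Self-adjoint and anti-self-adjoint parts of G.
S = [[0.5 * (G[i][j] + G[j][i]) for j in range(3)] for i in range(3)]
A = [[0.5 * (G[i][j] - G[j][i]) for j in range(3)] for i in range(3)]

a, b, c = A[0][1], A[0][2], A[1][2]       # names from (15)
f = [-c, b, -a]                           # angular velocity vector (16)

curl = [G[2][1] - G[1][2],                # dv3/dy2 - dv2/dy3
        G[0][2] - G[2][0],                # dv1/dy3 - dv3/dy1
        G[1][0] - G[0][1]]                # dv2/dy1 - dv1/dy2
half_curl = [0.5 * w for w in curl]
print(f, half_curl)
```

The two printed vectors coincide, which is the content of (26) for this particular flow.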

We recall from advanced calculus that a vector field $v$ whose curl is zero can, in
any simply connected domain, be written as the gradient of some scalar function $\phi$.
Thus for an irrotational flow, the velocity is

$$v = \operatorname{grad} \phi;$$

$\phi$ is called the velocity potential.

We calculate now the rate at which the fluid is being expanded. We saw earlier
that expansion is $\det J$. Therefore the rate at which fluid is expanded is $(d/dt) \det J$. In
Chapter 9, Theorem 4, we have given a formula, equation (10), for the logarithmic
derivative of the determinant:

$$\frac{d}{dt} \log \det J = \operatorname{tr}(J^{-1} J_t). \qquad (27)$$

We set $t = 0$; according to (21)$_0$, $J(0) = I$; therefore we can rewrite equation (27) as

$$\frac{d}{dt} \det J(0) = \operatorname{tr} J_t(0).$$

By (24)', $(J_t)_{ii} = \partial v_i / \partial y_i$; therefore

$$\frac{d}{dt} \det J(0) = \sum_i \frac{\partial v_i}{\partial y_i} = \operatorname{div} v.$$

In words: A fluid that is flowing with velocity $v$ is being expanded at the rate $\operatorname{div} v$.
That is why the velocity field of an incompressible fluid is divergence free.


3. THE FREQUENCY OF SMALL VIBRATIONS

By small vibrations we mean motions of small amplitude about a point of
equilibrium. Since the amplitude is small, the equation of motion can be taken to be
linear. Let us start with the one-dimensional case, the vibration of a mass $m$ under the
action of a spring. Denote by $x = x(t)$ displacement of the mass from equilibrium
$x = 0$. The force of the spring, restoring the mass toward equilibrium, is taken to
be $-kx$, $k$ a positive constant. Newton's law of motion, force equals mass times
acceleration, says that

$$m\ddot{x} + kx = 0; \qquad (28)$$

here the dot symbol denotes differentiation with respect to $t$.
Multiply (28) by $\dot{x}$:

$$m\ddot{x}\dot{x} + kx\dot{x} = \frac{d}{dt}\left[\frac{1}{2} m \dot{x}^2 + \frac{1}{2} k x^2\right] = 0;$$

therefore

$$\frac{1}{2} m \dot{x}^2 + \frac{1}{2} k x^2 = E \qquad (29)$$

is a constant, independent of $t$. The first term in (29) is the kinetic energy of a mass $m$
moving with velocity $\dot{x}$; the second term is the potential energy stored in a spring
displaced by the amount $x$. That their sum, $E$, is constant expresses the conservation
of total energy.

The equation of motion (28) can be solved explicitly: All solutions are of the form

$$x(t) = a \sin\left(\sqrt{\frac{k}{m}}\, t + \theta\right); \qquad (30)$$

$a$ is called the amplitude, $\theta$ the phase. All solutions (30) are periodic in $t$, with period
$p = 2\pi\sqrt{m/k}$. The frequency, defined as the reciprocal of the period, is the number
of vibrations the system performs per unit time:

$$\text{frequency} = \frac{1}{p} = \frac{1}{2\pi}\sqrt{\frac{k}{m}}. \qquad (31)$$

We note that part of this result can be deduced by dimensional analysis. From the
fact that $kx$ is a force, we deduce that

$$\dim k \cdot \text{length} = \dim \text{force} = \text{mass} \cdot \text{acceleration} = \frac{\text{mass} \cdot \text{length}}{\text{time}^2}.$$

So

$$\dim k = \frac{\text{mass}}{\text{time}^2}.$$

The only quantity constructed out of the two parameters $m$ and $k$ whose dimension is
time is const $\sqrt{m/k}$. So we conclude that the period $p$ of motion is given by

$$p = \text{const} \sqrt{m/k}.$$

Formula (31) shows that frequency is an increasing function of $k$, and a decreasing
function of $m$. Intuitively this is clear; increasing $k$ makes the spring stiffer and the
vibration faster; the smaller the mass, the faster the vibration.
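
The conservation of energy (29) and the period of the solution (30) can be confirmed numerically. The following sketch uses made-up values for the mass, spring constant, amplitude, and phase, and samples the energy along the exact solution.

```python
import math

m, k = 2.0, 8.0                  # hypothetical mass and spring constant
a, theta = 0.05, 0.3             # hypothetical amplitude and phase
omega = math.sqrt(k / m)

def x(t):
    return a * math.sin(omega * t + theta)            # solution (30)

def x_dot(t):
    return a * omega * math.cos(omega * t + theta)    # its time derivative

def energy(t):
    return 0.5 * m * x_dot(t) ** 2 + 0.5 * k * x(t) ** 2   # left side of (29)

period = 2.0 * math.pi * math.sqrt(m / k)
frequency = 1.0 / period                               # formula (31)

energies = [energy(0.1 * i) for i in range(50)]
print(period, frequency, min(energies), max(energies))
```

Along the solution the energy stays at the value $\tfrac{1}{2} k a^2$, since $m$ times the squared angular frequency equals $k$.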

We present now a far-reaching generalization of this result to the motion of a
system of $n$ masses on a line, each linked elastically to each other and to the origin.
Denote by $x_i$ the position of the $i$th particle; Newton's second law of motion for the
$i$th particle is

$$m_i \ddot{x}_i - f_i = 0, \qquad (32)$$

where $f_i$ is the total force acting on the $i$th particle and $m_i$ is its mass. We take the
origin to be a point of equilibrium for the system, that is, all $f_i$ are zero when all the $x_i$
are zero.

We denote by $f_{ij}$ the force exerted by the $j$th particle on the $i$th. According to
Newton's third law, the force exerted by the $i$th particle on the $j$th is $-f_{ij}$. We take $f_{ij}$
to be proportional to the distance of $x_i$ and $x_j$:

$$f_{ij} = k_{ij}(x_j - x_i), \quad i \ne j. \qquad (33)$$

To satisfy $f_{ij} = -f_{ji}$ we take $k_{ij} = k_{ji}$. Finally, we take the force exerted from the
origin on particle $i$ to be $-k_i x_i$. Altogether we have

$$f_i = \sum_j k_{ij} x_j, \qquad k_{ii} = -k_i - \sum_{j \ne i} k_{ij}. \qquad (33)'$$

We now rewrite the system (32) in matrix form as

$$M\ddot{x} + Kx = 0; \qquad (32)'$$

here $x$ denotes the vector $(x_1, \ldots, x_n)$, $M$ is a diagonal matrix with entries $m_i$, and
the elements of $K$ are $-k_{ij}$ from (33)'. The matrix $K$ is real and symmetric; then
taking the scalar product of (32)' with $\dot{x}$ we obtain

$$(\dot{x}, M\ddot{x}) + (\dot{x}, Kx) = 0.$$

Using the symmetry of $K$ and $M$ we can rewrite this as

$$\frac{d}{dt}\left[\frac{1}{2}(\dot{x}, M\dot{x}) + \frac{1}{2}(x, Kx)\right] = 0,$$

from which we conclude that

$$\frac{1}{2}(\dot{x}, M\dot{x}) + \frac{1}{2}(x, Kx) = E \qquad (34)$$

is a constant independent of $t$. The first term on the left-hand side is the kinetic
energy of the masses, the second term the potential energy stored in the system when
the particles have been displaced from the origin to $x$. That their sum, $E$, is constant
during the motion is an expression of the conservation of total energy.

We assume now that all the forces are attractive, that is, that $k_{ij}$ and $k_i$ are positive.
We claim that then the matrix $K$ is positive. For proof see Theorem 5 at the end of
this chapter. According to inequality (5)' of Chapter 10, a positive matrix $K$ satisfies
for all $x$,

$$a\|x\|^2 \le (x, Kx), \quad a \text{ positive}.$$

Since the diagonal matrix $M$ is positive, combining the above inequality with (34)
gives

$$\tfrac{1}{2} a\|x\|^2 \le E.$$

This shows that the amplitude $\|x\|$ is uniformly bounded for all time, and
furthermore if the total energy $E$ is sufficiently small, the amplitude $\|x\|$ is small.
A second important consequence of the positivity of $K$ is

Theorem 2. Solutions of the differential equation (32)' are uniquely
determined by their initial data $x(0)$ and $\dot{x}(0)$. That is, two solutions that have the
same initial data are equal for all time.

Proof. Since equation (32)' is linear, the difference of two solutions is again a
solution. Therefore it is sufficient to prove that if a solution $x$ has zero initial data,
then $x(t)$ is zero for all $t$. To see this, we observe that if $x(0) = 0$, $\dot{x}(0) = 0$, then
energy $E$ at $t = 0$ is zero. Therefore energy is zero for all $t$. But energy defined by
(34) is the sum of two nonnegative terms; therefore each is zero for all $t$. Since $K$ is
positive, $(x, Kx) = 0$ implies that $x(t) = 0$. □

Since equation (32)' is linear, its solutions form a linear space. We shall show that
the dimension of this space of solutions is $\le 2n$, where $n$ is the number of masses. To
see this, map each solution $x(t)$ into its initial data $x(0)$, $\dot{x}(0)$. Since there are $n$
particles, their initial data belong to a $2n$-dimensional linear space. This mapping is
linear; we claim that it is 1-to-1. According to Theorem 2, two solutions with the
same initial data are equal; in particular the nullspace of this map is $\{0\}$. Then it
follows from Theorem 1 of Chapter 3 that the dimension of the space of solutions is
$\le 2n$.

We turn now to finding all solutions of the equations of motion (32)'. Since the
matrices $M$ and $K$ are constant, differentiating equation (32)' with respect to $t$ gives

$$M\dddot{x} + K\dot{x} = 0.$$

In words: If $x(t)$ is a solution of (32)', so is $\dot{x}(t)$.

The solutions of (32)' form a finite-dimensional space. The mapping $x \to \dot{x}$ maps
this space into itself. According to the spectral theorem, the eigenfunctions and
generalized eigenfunctions of this mapping span the space.

Eigenfunctions of the map $x \to \dot{x}$ satisfy the equation $\dot{x} = ax$; the solutions of this
are $x(t) = e^{at}h$, where $a$ is a complex number, $h$ is a vector with $n$ components, and
$n$ is the number of particles. Since we have shown above that each solution of (32)' is
uniformly bounded for all $t$, it follows that $a$ is pure imaginary: $a = ic$, $c$ real. To
determine $c$ and $h$ we set $x = e^{ict}h$ into (32)'. We get, after dividing by $e^{ict}$, that

$$c^2 Mh = Kh. \qquad (35)$$

This is an eigenvalue problem we have already encountered in Chapter 8, equation
(48). We can reduce (35) to a standard eigenvalue problem by introducing
$M^{1/2}h = k$ as new unknown vector into (35) and then multiplying equation (35) on
the left by $M^{-1/2}$. We get

$$c^2 k = M^{-1/2} K M^{-1/2} k. \qquad (35)'$$

Since $M^{-1/2} K M^{-1/2}$ is self-adjoint, it has $n$ linearly independent eigenvectors
$k_1, \ldots, k_n$, with corresponding eigenvalues $c_1^2, \ldots, c_n^2$. Since, as we shall show, $K$ is a
positive matrix, so is $M^{-1/2} K M^{-1/2}$. Therefore the $c_j$ are real numbers; we take them
to be positive.

The corresponding $n$ solutions of the differential equation (32)' are $e^{ic_j t}h_j$, whose
real and imaginary parts also are solutions:

$$(\cos c_j t)h_j, \quad (\sin c_j t)h_j, \qquad (36)$$

as are all linear combinations of them:

$$\sum_j a_j (\cos c_j t)h_j + \sum_j b_j (\sin c_j t)h_j = x(t); \qquad (36)'$$

the $a_j$ and $b_j$ are arbitrary real numbers.

Theorem 3. Every solution of the differential equation (32)' is of form (36)'.

Proof. Solutions of the form (36)' form a $2n$-dimensional space. We have shown
that the set of all solutions is a linear space of dimension $\le 2n$. It follows that all
solutions are of form (36)'. □


Exercise 10. Verify that solutions of the form (36)' form a $2n$-dimensional
linear space.

The special solutions (36) are called normal modes; each is periodic, with period
$2\pi/c_j$ and frequency $c_j/2\pi$. These are called the natural frequencies of the
mechanical system governed by equation (32)'.

Theorem 4. Consider two differential equations of form (32)':

    $M\ddot{x} + Kx = 0, \qquad N\ddot{y} + Ly = 0,$    (37)

M, K, N, L positive, real n x n matrices. Suppose that

    $M \ge N$ and $K \le L.$    (38)

Denote $2\pi$ times the natural frequencies of the first system, arranged in increasing
order, by $c_1 \le \cdots \le c_n$, and those of the second system by $d_1 \le \cdots \le d_n$. We claim
that

    $c_j \le d_j, \quad j = 1, \ldots, n.$    (39)

Proof. We introduce an intermediate differential equation

    $M\ddot{z} + Lz = 0.$

Denote its natural frequencies by $f_j/2\pi$. In analogy with equation (35), the $f_j$ satisfy

    $f^2 M h = L h,$

where h is an eigenvector. In analogy with equation (35)', we can identify the
numbers $f^2$ as the eigenvalues of

    $M^{-1/2} L M^{-1/2}.$

We recall that the numbers $c^2$ are eigenvalues of

    $M^{-1/2} K M^{-1/2}.$

Since K is assumed to be $\le L$, it follows from Theorem 1 of Chapter 10 that also

    $M^{-1/2} K M^{-1/2} \le M^{-1/2} L M^{-1/2}.$

Then it follows from Theorem 16 of Chapter 10 that

    $c_j^2 \le f_j^2, \quad j = 1, \ldots, n.$    (39)'

On the other hand, in analogy with equation (35)', we can identify the reciprocals
$1/f^2$ as the eigenvalues of

    $L^{-1/2} M L^{-1/2},$

whereas the reciprocals $1/d^2$ are the eigenvalues of

    $L^{-1/2} N L^{-1/2}.$

Since N is assumed to be $\le M$, it follows as before that

    $L^{-1/2} N L^{-1/2} \le L^{-1/2} M L^{-1/2},$

so by Theorem 16 of Chapter 10

    $\frac{1}{d_j^2} \le \frac{1}{f_j^2}, \quad j = 1, \ldots, n.$    (39)''

We can combine inequalities (39)' and (39)'' to deduce (39). □

Note: If either of the inequalities in (38) is strict, then all the inequalities in (39)
are strict.

The intuitive meaning of Theorem 4 is that if in a mechanical system we stiffen 
the forces binding the particles to each other and reduce the mass of all the particles, 
then all natural frequencies of the system increase. 
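
As a small numerical illustration of Theorem 4, the following sketch computes the numbers $c_j$ for two systems of two particles each and checks inequality (39). It rests on assumptions that are not in the text: the mass matrices are taken to be diagonal, the numerical entries are made up for the example, and the helper name `natural_frequencies` is hypothetical.

```python
import math

def natural_frequencies(masses, K):
    """Return [c_1, c_2] with c_1 <= c_2 for M x'' + K x = 0.

    Assumes M = diag(masses) with positive masses and K a positive
    symmetric 2 x 2 matrix given as nested lists.
    """
    m1, m2 = masses
    # Entries of the symmetric matrix M^(-1/2) K M^(-1/2) from (35)'.
    a = K[0][0] / m1
    b = K[0][1] / math.sqrt(m1 * m2)
    d = K[1][1] / m2
    # Eigenvalues of a symmetric 2 x 2 matrix via trace and determinant.
    half_trace = (a + d) / 2.0
    det = a * d - b * b
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    # The eigenvalues are the c_j^2; take positive square roots.
    return [math.sqrt(half_trace - root), math.sqrt(half_trace + root)]

# First system: heavier masses and weaker springs, so M >= N and K <= L.
c = natural_frequencies((2.0, 3.0), [[2.0, -1.0], [-1.0, 2.0]])
# Second system: lighter masses and stiffer springs.
d = natural_frequencies((1.0, 2.0), [[3.0, -1.0], [-1.0, 3.0]])

print(c, d)
print(all(cj <= dj for cj, dj in zip(c, d)))
```

In this example $M - N$ and $L - K$ are both the identity matrix, so hypothesis (38) holds, and the final line reports whether each $c_j$ is at most the corresponding $d_j$.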

We supply now the proof of the positivity of the matrix K. 

Theorem 5. Suppose that the numbers $k_i$ and $k_{ij}$, $i \ne j$, are positive. Then the
symmetric matrix K,

    $K_{ij} = -k_{ij}, \quad i \ne j; \qquad K_{ii} = k_i + \sum_{j \ne i} k_{ij},$    (40)

is positive.

Proof. It suffices to show that every eigenvalue a of K is positive:

    $Ku = au.$    (41)

Normalize the eigenvector u of K so that the largest component, say $u_i$, equals 1, and
all others are $\le 1$. The ith component of (41) is

    $K_{ii} + \sum_{j \ne i} K_{ij} u_j = a.$

Using the definition (40) of the entries of K, this can be rewritten as

    $k_i + \sum_{j \ne i} k_{ij} (1 - u_j) = a.$

The left-hand side is positive; therefore, so is the right-hand side a. □

For a more general result, see Appendix 7. 
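
Theorem 5 can be spot-checked numerically. The sketch below builds K from (40) for made-up positive numbers $k_i$ and $k_{ij}$, and tests positivity by attempting a Cholesky factorization, which succeeds exactly when a real symmetric matrix is positive definite. The function names and the sample data are assumptions introduced for illustration only.

```python
import math

def stiffness_matrix(k_self, k_pair):
    """Build K as in (40): K_ij = -k_ij for i != j, K_ii = k_i + sum_j k_ij."""
    n = len(k_self)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                K[i][j] = -k_pair[i][j]
        K[i][i] = k_self[i] + sum(k_pair[i][j] for j in range(n) if j != i)
    return K

def is_positive_definite(A):
    """Attempt a Cholesky factorization A = L L^T of a symmetric matrix A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # a nonpositive pivot: A is not positive
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

# Hypothetical data: three particles, symmetric pair constants k_ij > 0.
k_self = [1.0, 2.0, 0.5]
k_pair = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 3.0],
          [2.0, 3.0, 0.0]]
K = stiffness_matrix(k_self, k_pair)
print(K)
print(is_positive_definite(K))
```

The check is only a sanity test on one matrix; the proof above is what establishes positivity in general.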



CHAPTER 12


Convexity

Linear Algebra and Its Applications, Second Edition, by Peter D. Lax


Convexity is a primitive notion, based on nothing but the bare bones of the structure
of linear spaces over the reals. Yet some of its basic results are surprisingly deep;
furthermore, these results make their appearance in an astonishingly wide variety of
topics.

X is a linear space over the reals. For any pair of vectors x, y in X, the line segment
with endpoints x and y is defined as the set of points in X of form

    $ax + (1 - a)y, \quad 0 \le a \le 1.$    (1)

Definition. A set K in X is called convex if, whenever x and y belong to K, all
points of the line segment with endpoints x, y also belong to K.


Examples of Convex Sets

(a) K = the whole space X.

(b) K = $\emptyset$, the empty set.

(c) K = {x}, a single point.

(d) K = any line segment.

(e) Let l be a linear function in X; then the sets

    $l(x) = c$, called a hyperplane,    (2)
    $l(x) < c$, called an open half-space,    (3)
    $l(x) \le c$, called a closed half-space,    (4)

are all convex sets.

(f) X the space of all polynomials with real coefficients, K the subset of all
polynomials that are positive at every point of the interval (0, 1).

(g) X the space of real, self-adjoint matrices, K the subset of positive matrices.


Exercise 1. Verify that these are convex sets.

Theorem 1. (a) The intersection of any collection of convex sets is convex.

(b) The sum of two convex sets is convex, where the sum of two sets K and H is
defined as the set of all sums x + y, x in K, y in H.

Exercise 2. Prove these propositions.

Using Theorem 1, we can build an astonishing variety of convex sets out of a few
basic ones. For instance, a triangle in the plane is the intersection of three half-planes.

Definition. A point x is called an interior point of a set S in X if for every y in X,
x + ty belongs to S for all sufficiently small positive t.

Definition. A convex set K in X is called open if every point in it is an interior point.

Exercise 3. Show that an open half-space (3) is an open convex set.

Exercise 4. Show that if A is an open convex set and B is convex, then A + B is
open and convex.

Definition. Let K be an open convex set that contains the vector 0. We define its
gauge function $p_K = p$ as follows: For every x in X,

    $p(x) = \inf r, \quad r > 0$ and $x/r$ in K.    (5)

Exercise 5. Let X be a Euclidean space, and let K be the open ball of radius a
centered at the origin: $\|x\| < a$.

(i) Show that K is a convex set.

(ii) Show that the gauge function of K is $p(x) = \|x\|/a$.

Exercise 6. In the (u, v) plane take K to be the quarter-plane $u < 1$, $v < 1$.
Show that the gauge function of K is

    $p(u, v) = 0$            if $u \le 0$, $v \le 0$,
    $p(u, v) = v$            if $0 < v$, $u \le 0$,
    $p(u, v) = u$            if $0 < u$, $v \le 0$,
    $p(u, v) = \max(u, v)$   if $0 < u$, $0 < v$.

Theorem 2. (a) The gauge function p of an open convex set K that contains the
origin is well-defined for every x.

(b) p is positive homogeneous:

    $p(ax) = a\,p(x)$ for $a > 0.$    (6)

(c) p is subadditive:

    $p(x + y) \le p(x) + p(y).$    (7)

(d) $p(x) < 1$ iff x is in K.

Proof. Call the set of $r > 0$ for which $x/r$ is in K admissible for x. To prove
(a) we have to show that for any x the set of admissible r is nonempty. This follows
from the assumption that 0 is an interior point of K.

(b) follows from the observation that if r is admissible for x and $a > 0$, then ar is
admissible for ax.

(c) Let s and t be positive numbers such that

    $p(x) < s, \qquad p(y) < t.$    (8)

Then by definition of p as inf, it follows that s and t are admissible for x and y;
therefore $x/s$ and $y/t$ belong to K. The point

    $\frac{x + y}{s + t} = \frac{s}{s + t}\,\frac{x}{s} + \frac{t}{s + t}\,\frac{y}{t}$    (9)

lies on the line segment connecting $x/s$ and $y/t$. By convexity, $(x + y)/(s + t)$ belongs
to K. This shows that $s + t$ is admissible for $x + y$; so by definition of p,

    $p(x + y) \le s + t.$    (10)

Since s and t can be chosen arbitrarily close to $p(x)$ and $p(y)$, (c) follows.

(d) Suppose $p(x) < 1$; by definition there is an admissible $r < 1$. Since r is
admissible, $x/r$ belongs to K. The identity $x = r\,(x/r) + (1 - r)\,0$ shows that x lies on the
line segment with endpoints 0 and $x/r$, so by convexity belongs to K.

Conversely, suppose x belongs to K; since x is assumed to be an interior point of K,
the point $x + \epsilon x$ belongs to K for $\epsilon > 0$ but small enough. This shows that
$r = 1/(1 + \epsilon)$ is admissible, and so by definition

    $p(x) \le \frac{1}{1 + \epsilon} < 1.$

This completes the proof of the theorem. □

Exercise 7. Let p be a positive homogeneous, subadditive function. Prove that
the set K consisting of all x for which $p(x) < 1$ is convex and open.
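
The definition (5) of the gauge function lends itself to a direct numerical experiment. The sketch below approximates $p(x)$ by bisection on r, using only a membership test for K, and compares the result with the closed forms stated in Exercises 5 and 6. The function name `gauge`, the fixed number of bisection steps, and the sample points are assumptions made for this illustration; the bisection relies on K being convex and containing 0, so that every number larger than an admissible r is again admissible.

```python
import math

def gauge(in_K, x, steps=100):
    """Approximate p(x) = inf { r > 0 : x / r in K } by bisection.

    Assumes in_K is the membership test of an open convex set containing 0,
    so that the admissible r form an interval unbounded to the right.
    """
    hi = 1.0
    while not in_K([xi / hi for xi in x]):  # find some admissible r
        hi *= 2.0
    lo = 0.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if mid > 0.0 and in_K([xi / mid for xi in x]):
            hi = mid   # mid is admissible, so p(x) <= mid
        else:
            lo = mid   # mid is not admissible, so p(x) >= mid
    return hi

def quarter_plane(z):
    """The set of Exercise 6: u < 1 and v < 1."""
    return z[0] < 1 and z[1] < 1

def open_ball(radius):
    """The set of Exercise 5: Euclidean norm less than radius."""
    return lambda z: math.sqrt(sum(t * t for t in z)) < radius

print(gauge(quarter_plane, (2.0, -3.0)))   # compare with max(u, v, 0)
print(gauge(quarter_plane, (-1.0, -1.0)))  # compare with 0
print(gauge(open_ball(2.0), (3.0, 4.0)))   # compare with ||x|| / a
```

The four cases of Exercise 6 can be summarized as $p(u, v) = \max(u, v, 0)$, which is what the first two calls are meant to be compared against.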

Theorem 2 gives an analytical description of the open convex sets. There is another,
dual description. To derive it we need the following basic, and geometrically intuitive,
result.

Theorem 3. Let K be an open convex set, and let y be a point not in K. Then
there is an open half-space containing K but not y.

Proof. An open half-space is by definition a set of points satisfying inequality
$l(x) < c$; see (3). So we have to construct a linear function l and a number c such that

    $l(x) < c$ for all x in K,    (11)

    $l(y) = c.$    (12)

We assume that 0 lies in K; otherwise shift K. Set $x = 0$ in (11); we get $0 < c$. We
may set $c = 1$. Let p be the gauge function of K; according to Theorem 2, points of K
are characterized by $p(x) < 1$. It follows that (11) can be stated so:

    If $p(x) < 1$, then $l(x) < 1.$    (11)'

This will certainly be the case if

    $l(x) \le p(x)$ for all x.    (13)

So Theorem 3 is a consequence of the following: there exists a linear function l
which satisfies (13) for all x and whose value at y is 1. We show first that the two
requirements are compatible. Requiring $l(y) = 1$ implies by linearity that $l(ky) = k$
for all k. We show now that (13) is satisfied for all x of form ky; that is, for all k,

    $k = l(ky) \le p(ky).$    (14)

For k positive, we can by (6) rewrite this as

    $k \le k\,p(y),$    (14)'

true because y does not belong to K and so by part (d) of Theorem 2, $p(y) \ge 1$. On
the other hand, inequality (14) holds for k negative: the left-hand side is less
than 0, while the right-hand side, by definition (5) of gauge function, is nonnegative.

The remaining task is to extend l from the line through y to all of X so that (13) is
satisfied. The next theorem asserts that this can be done.

Theorem 4 (Hahn-Banach). Let p be a real-valued positive homogeneous
subadditive function defined on a linear space X over R. Let U be a subspace of X on
which a linear function l is defined, satisfying (13):

    $l(u) \le p(u)$ for all u in U.    (13)_U

Then l can be extended to all of X so that (13) is satisfied for all x.

Proof. Proof is by induction; we show that l can be extended to a subspace V
spanned by U and any vector z not in U. That is, V consists of all vectors of form

    $v = u + tz$, u in U, t any real number.

Since l is linear,

    $l(v) = l(u) + t\,l(z);$

this shows that the value of $l(z) = a$ determines the value of l on V:

    $l(v) = l(u) + ta.$

The task is to choose a so that (13) is satisfied: $l(v) \le p(v)$, that is,

    $l(u) + ta \le p(u + tz)$    (13)_V

for all u in U and all real t.

We divide (13)_V by $|t|$. For $t > 0$, using positive homogeneity of p and linearity of
l we get

    $l(u^*) + a \le p(u^* + z),$    (14)_+

where $u^*$ denotes $u/t$. For $t < 0$ we obtain

    $l(u^{**}) - a \le p(u^{**} - z),$    (14)_-

where $u^{**}$ denotes $-u/t$. Clearly, (13)_V holds for all u in U and all real t iff (14)_+ and
(14)_- hold for all $u^*$ and $u^{**}$, respectively, in U.

We rewrite (14)_± as

    $l(u^{**}) - p(u^{**} - z) \le a \le p(u^* + z) - l(u^*);$

the number a has to be so chosen that this holds for all $u^*$ and $u^{**}$ in U. Clearly, this is
possible iff every number on the left is less than or equal to any number on the right,
that is, if

    $l(u^{**}) - p(u^{**} - z) \le p(u^* + z) - l(u^*)$    (15)

for all $u^*$, $u^{**}$ in U. We can rewrite this inequality as

    $l(u^{**}) + l(u^*) \le p(u^* + z) + p(u^{**} - z).$    (15)'

By linearity, the left-hand side can be written as $l(u^{**} + u^*)$; since (13)_U holds,

    $l(u^{**} + u^*) \le p(u^{**} + u^*).$

Since p is subadditive,

    $p(u^{**} + u^*) = p(u^{**} - z + u^* + z) \le p(u^{**} - z) + p(u^* + z).$

This proves (15)', which shows that l can be extended to V. Repeating this n times,
we extend l to the whole space X. □

This completes the proof of Theorem 3. □

Note. The Hahn-Banach Theorem holds in infinite-dimensional spaces. The
proof is the same, with some added logical prestidigitation.

The following result is an easy extension of Theorem 3.

Theorem 5. Let K and H be open convex sets that are disjoint. Then there is a
hyperplane that separates them. That is, there is a linear function l and a constant d
such that

    $l(x) < d$ on K, $\quad l(y) > d$ on H.

Proof. Define the difference K - H to consist of all differences x - y, x in K, y in
H. It is easy to verify that this is an open, convex set. Since K and H are disjoint,
K - H does not contain the origin. Then by Theorem 3, with y = 0, and therefore
c = 0, there is a linear function l that is negative on K - H:

    $l(x - y) < 0$ for x in K, y in H.

We can rewrite this as

    $l(x) < l(y)$ for all x in K, y in H.

It follows from the completeness of real numbers that there is a number d such that
for all x in K, y in H,

    $l(x) \le d \le l(y).$

Since both K and H are open, the sign of equality cannot hold; this proves
Theorem 5. □

We show next how to use Theorem 3 to give a dual description of open convex
sets.


Definition. Let S be any set in X. We define its support function $q_S$ on the dual
X' of X as follows:

    $q_S(l) = \sup_{x \text{ in } S} l(x),$    (16)

where l is any linear function.

Remark. $q_S(l)$ may be $\infty$ for some l.

Exercise 8. Prove that the support function $q_S$ of any set is subadditive; that is,
it satisfies $q_S(m + l) \le q_S(m) + q_S(l)$ for all l, m in X'.

Exercise 9. Let S and T be arbitrary sets in X. Prove that
$q_{S+T}(l) = q_S(l) + q_T(l)$.

Exercise 10. Show that $q_{S \cup T}(l) = \max\{q_S(l), q_T(l)\}$.
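
For finite sets the supremum in (16) is a maximum, so the identities in Exercises 8 to 10 can be checked on examples before proving them. The sketch below does this in the plane; the sets S and T, the linear functions l and m (represented by coefficient tuples), and the helper names are all made up for illustration, and a check on one example is of course no substitute for the proofs asked for in the exercises.

```python
def support(S, l):
    """Support function q_S(l) = max of l(x) over a finite set S."""
    return max(sum(li * xi for li, xi in zip(l, x)) for x in S)

def minkowski_sum(S, T):
    """All sums x + y with x in S and y in T."""
    return [tuple(a + b for a, b in zip(x, y)) for x in S for y in T]

S = [(0, 0), (1, 0), (0, 1)]
T = [(2, 2), (3, -1)]
l = (1, 2)
m = (-1, 0.5)
l_plus_m = tuple(a + b for a, b in zip(l, m))

# Exercise 8: subadditivity in the linear function.
print(support(S, l_plus_m) <= support(S, l) + support(S, m))
# Exercise 9: the support function of a sum of sets.
print(support(minkowski_sum(S, T), l) == support(S, l) + support(T, l))
# Exercise 10: the support function of a union.
print(support(S + T, l) == max(support(S, l), support(T, l)))
```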

Theorem 6. Let K be an open convex set, $q_K$ its support function. Then x
belongs to K iff

    $l(x) < q_K(l)$    (17)

for all l in X'.

Proof. It follows from definition (16) that for every x in K, $l(x) \le q_K(l)$ for
every l; therefore the strict inequality (17) holds for all interior points x in K. To see
the converse, suppose that y is not in K. Then by Theorem 3 there is an l such that
$l(x) < 1$ for all x in K, but $l(y) = 1$. Thus

    $l(y) = 1 \ge \sup_{x \text{ in } K} l(x) = q_K(l);$    (18)

this shows that y not in K fails to satisfy (17) for some l. This proves
Theorem 6. □

Definition. A convex set K in X is called closed if every open segment
$ax + (1 - a)y$, $0 < a < 1$, that belongs to K has its endpoints x and y in K.

Examples

The whole space X is closed.
The empty set is closed.
A set consisting of a single point is closed.
An interval of form (1) is closed.

Exercise 11. Show that a closed half-space as defined by (4) is a closed convex
set.

Exercise 12. Show that the closed unit ball in Euclidean space, consisting of all
points $\|x\| \le 1$, is a closed convex set.

Exercise 13. Show that the intersection of closed convex sets is a closed
convex set.

Theorems 2, 3, and 6 have their analogue for closed convex sets.

Theorem 7. Let K be a closed, convex set, and y a point not in K. Then there is a
closed half-space that contains K but not y.

Sketch of Proof. Suppose K contains the origin. If K has no interior points, it lies
in a lower-dimensional subspace. If it has an interior point, we choose it to be the
origin. Then the gauge function $p_K$ of K can be defined as before. If x belongs to K,
we may choose in the definition (5) of $p_K$ the value r = 1; this shows that for x in K,
$p_K(x) \le 1$. Conversely, if $p_K(x) < 1$, then by (5) $x/r$ belongs to K for some $r < 1$.
Since 0 belongs to K, by convexity so does x. If $p_K(x) = 1$, then for all $r > 1$, $x/r$
belongs to K. Since K is closed, so does the endpoint x. This shows that K consists
of all points x which satisfy $p_K(x) \le 1$. We then proceed as in the proof of
Theorem 3. □

Theorem 7 can be rephrased as follows.

Theorem 8. Let K be a closed, convex set, $q_K$ its support function. Then x
belongs to K iff

    $l(x) \le q_K(l)$    (19)

for all l in X'.

Exercise 14. Complete the proof of Theorems 7 and 8.

Both Theorems 6 and 8 describe convex sets as intersections of half-spaces, open
and closed, respectively.

Definition. Let S be an arbitrary set in X. The closed convex hull of S is defined
as the intersection of all closed convex sets containing S.

Theorem 9. The closed convex hull of any set S is the set of points x satisfying
$l(x) \le q_S(l)$ for all l in X'.

Exercise 15. Prove Theorem 9.

Let $x_1, \ldots, x_m$ denote m points in X, and $p_1, \ldots, p_m$ denote m nonnegative
numbers whose sum is 1:

    $p_j \ge 0, \qquad \sum_1^m p_j = 1.$    (20)

Then

    $x = \sum_1^m p_j x_j$    (20)'

is called a convex combination of $x_1, \ldots, x_m$.

Exercise 16. Show that if $x_1, \ldots, x_m$ belong to a convex set, then so does any
convex combination of them.

A point of a convex set K that is not an interior point is called a
boundary point of K.

Proof. We prove this inductively on the dimension of X. We distinguish two
cases:

(i) K has no interior points. Suppose K contains the origin, which can always be
arranged by shifting K appropriately. We claim that K does not contain n linearly
independent vectors; for if it did, the convex combinations of these vectors and the
origin would also belong to K; but these points constitute an n-dimensional simplex,
full of interior points. Let m be the largest number of linearly independent vectors in
K, and let $x_1, \ldots, x_m$ be m linearly independent vectors. Then $m < n$, and m being
maximal, every other vector in K is a linear combination of $x_1, \ldots, x_m$. This proves
that K is contained in an m-dimensional subspace of X. By the induction hypothesis,
Theorem 10 holds for K.

(ii) K has interior points. Denote by $K_0$ the set of all interior points of K. It is easy
to show that $K_0$ is convex and that $K_0$ is open. We claim that K has boundary points;
for, since K is bounded, any ray issuing from any interior point of K intersects K in an
interval; since K is closed, the other endpoint is a boundary point y of K.

Let y be a boundary point of K. We apply Theorem 3 to $K_0$ and y; clearly y does
not belong to $K_0$, so there is a linear functional l such that

    $l(y) = 1, \qquad l(x_0) < 1$ for all $x_0$ in $K_0$.    (21)

We claim that $l(x_1) \le 1$ for all $x_1$ in K. Pick any interior point $x_0$ of K; then all points
x on the open segment bounded by $x_0$ and $x_1$ are interior points of K, and so by (21),
$l(x) < 1$. It follows that at the endpoint $x_1$, $l(x_1) \le 1$.

Denote by $K_1$ the set of those points x of K for which $l(x) = 1$. Being the
intersection of two closed, convex sets, $K_1$ is closed and convex; since K is bounded,
so is $K_1$. Equation (21) shows that y belongs to $K_1$, so $K_1$ is nonempty.

We claim that every extreme point e of $K_1$ is also an extreme point of K; for,
suppose that

    $e = \frac{z + w}{2}$, z and w in K.

Since e belongs to $K_1$,

    $1 = l(e) = \frac{l(z) + l(w)}{2}.$    (22)

Both z and w are in K; as we have shown before, $l(z)$ and $l(w)$ are both less than or
equal to 1. Combining this with (22), we conclude that

    $l(z) = l(w) = 1.$

This puts both z and w into $K_1$. But since e is an extreme point of $K_1$, z = w. This
proves that extreme points of $K_1$ are extreme points of K.

Since $K_1$ lies in a hyperplane of dimension less than n, it follows from the
induction assumption that $K_1$ has a sufficient number of extreme points, that is, every
point in $K_1$ can be written as a convex combination of n extreme points of $K_1$. Since
we have shown that extreme points of $K_1$ are extreme points of K, this proves
Theorem 10 for boundary points of K.

Let $x_0$ be an interior point of K. We take any extreme point e of K (the previous
argument shows that there are such things) and look at the intersection of the line
through $x_0$ and e with K. Being the intersection of two closed convex sets, of which
one, K, is bounded, this intersection is a closed interval. Since e is an extreme point
of K, e is one of the end points; denote the other end point by y. Clearly, y is a
boundary point of K. Since by construction $x_0$ lies on this interval, it can be written in
the form

    $x_0 = py + (1 - p)e, \quad 0 < p < 1.$    (23)

We have shown above that y can be written as a convex combination of n extreme
points of K. Setting this into (23) gives a representation of $x_0$ as the convex
combination of (n + 1) extreme points. The proof of Theorem 10 is complete. □

We now give an application of Carathéodory's theorem.

Definition. An n x n matrix $S = (S_{ij})$ is called doubly stochastic if

    (i)   $S_{ij} \ge 0$ for all i, j,

    (ii)  $\sum_i S_{ij} = 1$ for all j,    (24)

    (iii) $\sum_j S_{ij} = 1$ for all i.

Such matrices arise, as the name indicates, in probability theory.

Clearly, the doubly stochastic matrices form a bounded, closed convex set in the
space of all n x n matrices.

Example. In Exercise 8 of Chapter 5 we defined the permutation matrix P
associated with the permutation p of the integers (1, ..., n) as follows:

    $P_{ij} = 1$ if $j = p(i)$, $\quad P_{ij} = 0$ otherwise.    (25)

Exercise 18. Verify that every permutation matrix is a doubly stochastic
matrix.

Theorem 11 (Dénes König, Garrett Birkhoff). The permutation matrices are
the extreme points of the set of doubly stochastic matrices.

Proof. It follows from (i) and (ii) of (24) that no entry of a doubly stochastic
matrix can be greater than 1. Thus $0 \le S_{ij} \le 1$.

We claim that all permutation matrices P are extreme points; for, suppose

    $P = \frac{A + B}{2},$

A and B doubly stochastic. It follows that if an entry of P is 1, the corresponding
entries of A and B both must be equal to 1, and if an entry of P is zero, so must be the
corresponding entries of A and B. This shows that A = B = P.

Next we show the converse. We start by proving that if S is doubly stochastic and
has an entry which lies between 0 and 1,

    $0 < S_{i_0 j_0} < 1,$    (26)_00

then S is not extreme. To see this we construct a sequence of entries, all of which lie
between 0 and 1, and which lie alternatingly on the same row or on the same column.
We choose $j_1$ so that

    $0 < S_{i_0 j_1} < 1.$    (26)_01

This is possible because the sum of elements in the $i_0$th row must be = 1, and
since (26)_00 holds. Similarly, since the sum of elements in the $j_1$st column = 1, and
since (26)_01 holds, we can choose a row $i_1$ so that

    $0 < S_{i_1 j_1} < 1.$    (26)_11

We continue in this fashion, until the same position is traversed twice. Thus a closed
chain has been constructed.

[Diagram: the closed chain of entries of S, moving alternately along a row and along a column until it returns to an entry already visited.]

We now define a matrix N as follows:

(a) The entries of N are zero except for those points that lie on the chain.

(b) The entries of N on the points of the chain are +1 and -1, in succession.

The matrix N has the following property:

(c) The row sums and column sums of N are zero.

We now define two matrices A, B by

    $A = S + \epsilon N, \qquad B = S - \epsilon N.$

It follows from (c) that the row sums and column sums of A and B are both 1. By (a)
and the construction the elements of S are positive at all points where N has a
nonzero entry. It follows therefore that $\epsilon$ can be chosen so small that both A and B
have nonnegative entries. This shows that A and B both are doubly stochastic. Since
$A \ne B$, and

    $S = \frac{A + B}{2},$

it follows that S is not an extreme point.

It follows that extreme points of the set of doubly stochastic matrices have entries
either 0 or 1. It follows from (24) that each row and each column has exactly one 1. It
is easy to check that such a matrix is a permutation matrix. This completes the proof
of the converse. □

Applying Theorem 10 in the situation described in Theorem 11, we conclude:
Every doubly stochastic matrix can be written as a convex combination of
permutation matrices:

    $S = \sum c(P)\,P, \qquad c(P) \ge 0, \quad \sum c(P) = 1.$

Exercise 19. Show that, except for two dimensions, the representation of
doubly stochastic matrices as convex combinations of permutation matrices is not
unique.
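
The representation of a doubly stochastic matrix as a convex combination of permutation matrices can be produced explicitly for small n. The sketch below uses a greedy peeling procedure that is not described in the text and is swapped in here for illustration: it repeatedly finds a permutation all of whose positions carry positive entries of the remainder, and subtracts the largest possible multiple of the corresponding permutation matrix. That such a permutation exists at each step follows from the statement above applied to the rescaled remainder. The function name, the brute-force search over all permutations, and the sample matrix are assumptions; the search is only practical for small n.

```python
from fractions import Fraction
from itertools import permutations

def birkhoff_decomposition(S):
    """Return a list of (coefficient, permutation) pairs representing S.

    Assumes S is doubly stochastic with exactly representable entries;
    a permutation p stands for the matrix with P[i][p[i]] = 1.
    """
    n = len(S)
    R = [[Fraction(v) for v in row] for row in S]  # exact remainder
    terms = []
    while any(v != 0 for row in R for v in row):
        # A permutation supported on the positive entries of the remainder.
        perm = next(p for p in permutations(range(n))
                    if all(R[i][p[i]] > 0 for i in range(n)))
        coeff = min(R[i][perm[i]] for i in range(n))
        for i in range(n):
            R[i][perm[i]] -= coeff  # at least one entry becomes zero
        terms.append((coeff, perm))
    return terms

half, quarter = Fraction(1, 2), Fraction(1, 4)
S = [[half, half, 0],
     [quarter, quarter, half],
     [quarter, quarter, half]]
for coeff, perm in birkhoff_decomposition(S):
    print(coeff, perm)
```

Because each step zeroes at least one entry of the remainder, the loop stops after finitely many steps. A different choice of permutation at some step can lead to a different representation, in line with Exercise 19.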

Carathéodory's theorem has many applications in analysis. Its infinite-dimensional
version is the Krein-Milman Theorem.

The last item in the chapter is a kind of a dual of Carathéodory's theorem.

Theorem 12 (Helly). Let X be a linear space of dimension n over the reals. Let
$\{K_1, \ldots, K_N\}$ be a collection of N convex sets in X. Suppose that every subcollection
of n + 1 sets K has a nonempty intersection. Then all K in the whole collection have
a common point.

Proof (Radon). We argue by induction on N, the number of sets, starting with the
trivial situation N = n + 1. Suppose that N > n + 1 and that the assertion is true for
N - 1 sets. It follows that if we omit any one of the sets, say $K_i$, the rest have a point
$x_i$ in common:

    $x_i \in K_j, \quad j \ne i.$    (27)

We claim that there are numbers $a_1, \ldots, a_N$, not all zero, such that

    $\sum_i a_i x_i = 0$    (28)

and

    $\sum_i a_i = 0.$    (28)'

These represent n + 1 equations for the N unknowns. According to Corollary A'
(concrete version) of Theorem 1 of Chapter 3, a homogeneous system of linear
equations has a nontrivial (i.e., not all unknowns are equal to 0) solution if the
number of equations is less than the number of unknowns. Since in our case n + 1 is
less than N, (28) and (28)' have a nontrivial solution.

It follows from (28)' that not all $a_i$ can be of the same sign; there must be some
positive ones and some negative ones. Let us renumber them so that $a_1, \ldots, a_p$ are
positive, the rest nonpositive.

We define a by

    $a = \sum_1^p a_i.$    (29)

Note that it follows from (28)' that

    $a = -\sum_{p+1}^N a_i.$    (29)'

We define y by

    $y = \frac{1}{a} \sum_1^p a_i x_i.$    (30)

Note that it follows from (28) and (30) that

    $y = -\frac{1}{a} \sum_{p+1}^N a_i x_i.$    (30)'

Each of the points $x_i$, i = 1, ..., p belongs to each of the sets $K_j$, j > p. It follows
from (29) that (30) represents y as a convex combination of $x_1, \ldots, x_p$. Since $K_j$ is
convex, it follows that y belongs to $K_j$ for j > p.

On the other hand, each $x_i$, i = p + 1, ..., N belongs to each $K_j$, $j \le p$. It follows
from (29)' that (30)' represents y as a convex combination of $x_{p+1}, \ldots, x_N$. Since $K_j$
is convex, it follows that y belongs to $K_j$ for $j \le p$. This concludes the proof of
Helly's theorem. □

Remark. Helly's theorem is nontrivial even in the one-dimensional case. Here
each $K_j$ is an interval, and the hypothesis that every $K_i$ and $K_j$ intersect implies that
the lower endpoint $a_i$ of any $K_i$ is less than or equal to the upper endpoint $b_j$ of any
other $K_j$. The point in common to all is then $\sup a_i$ or $\inf b_j$, or anything in between.
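
The one-dimensional case described in the remark is easy to try out. The sketch below takes closed bounded intervals given by their endpoints, checks the pairwise hypothesis, and returns the largest lower endpoint as a common point. The restriction to closed bounded intervals, the function names, and the sample data are assumptions made for this illustration.

```python
def pairwise_intersect(intervals):
    """True if every two closed intervals [a, b] in the list intersect."""
    return all(a1 <= b2 and a2 <= b1
               for (a1, b1) in intervals
               for (a2, b2) in intervals)

def common_point(intervals):
    """Return sup of the lower endpoints if it lies in every interval, else None."""
    lo = max(a for a, b in intervals)
    hi = min(b for a, b in intervals)
    return lo if lo <= hi else None

intervals = [(0, 3), (1, 4), (2, 5)]
print(pairwise_intersect(intervals), common_point(intervals))

disjoint = [(0, 1), (2, 3)]
print(pairwise_intersect(disjoint), common_point(disjoint))
```

For finitely many closed intervals the two functions always agree: the pairwise condition holds exactly when the largest lower endpoint does not exceed the smallest upper endpoint.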

Remark. In this chapter we have defined the notions of open convex set, closed
convex set, and bounded convex set purely in terms of the linear structure of the
space containing the convex set. Of course the notions open, closed, bounded have a
usual topological meaning in terms of the Euclidean distance. It is easy to see that if
a convex set is open, closed, or bounded in the topological sense, then it is open,
closed, or bounded in the linear sense used in this chapter.

Exercise 20. Show that if a convex set in a finite-dimensional Euclidean space
is open, or closed, or bounded in the linear sense defined above, then it is open, or
closed, or bounded in the topological sense, and conversely.



CHAPTER 13


The Duality Theorem

Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright 2007 John Wiley & Sons, Inc.


Let X be a linear space over the reals, dim X = n. Its dual X' consists of all linear
functions on X. If X is represented by column vectors x of n components $x_1, \ldots, x_n$,
then elements of X' are traditionally represented as row vectors $\xi$ with n components
$\xi_1, \ldots, \xi_n$. The value of $\xi$ at x is

    $\xi_1 x_1 + \cdots + \xi_n x_n.$    (1)

If we regard $\xi$ as a 1 x n matrix and regard x as an n x 1 matrix, (1) is their matrix
product $\xi x$.

Let Y be a subspace of X; in Chapter 2 we have defined the annihilator $Y^\perp$ of Y as
the set of all linear functions $\xi$ that vanish on Y, that is, satisfy

    $\xi y = 0$ for all y in Y.    (2)

According to Theorem 3 of Chapter 2, the dual of X' is X itself, and according to
Theorem 5 there, the annihilator of $Y^\perp$ is Y itself. In words: if $\xi x = 0$ for all $\xi$ in $Y^\perp$,
then x belongs to Y.

Suppose Y is defined as the linear space spanned by m given vectors $y_1, \ldots, y_m$ in
X. That is, Y consists of all vectors y of the form

    $y = \sum_1^m a_j y_j.$    (3)

Clearly, $\xi$ belongs to $Y^\perp$ iff

    $\xi y_j = 0, \quad j = 1, \ldots, m.$    (4)

So for the space Y defined by (3), the duality criterion stated above can be formulated
as follows: a vector y can be written as a linear combination (3) of m given vectors $y_j$
iff every $\xi$ that satisfies (4) also satisfies $\xi y = 0$.

We are asking now for a criterion that a vector y be the linear combination of m
given vectors $y_j$ with nonnegative coefficients:

    $y = \sum_1^m p_j y_j, \qquad p_j \ge 0.$    (5)

Theorem 1 (Farkas-Minkowski). A vector y can be written as a linear
combination of given vectors $y_j$ with nonnegative coefficients as in (5) iff every $\xi$ that
satisfies

    $\xi y_j \ge 0, \quad j = 1, \ldots, m$    (6)

also satisfies

    $\xi y \ge 0.$    (6)'


Proof. The necessity of condition (6)' is evident upon multiplying (5) on the left
by $\xi$. To show the sufficiency we consider the set K of all points y of form (5). Clearly,
this is a convex set; we claim it is closed. To see this we first note that any vector y
which may be represented in form (5) may be represented so in various ways.
Among all these representations there is by local compactness one, or several, for
which $\sum p_j$ is as small as possible. We call such a representation of y a minimal
representation.

Now let $\{z_n\}$ be a sequence of points of K converging to the limit z in the
Euclidean norm. Represent each $z_n$ minimally:

    $z_n = \sum_j p_{n,j}\, y_j.$    (5)'

We claim that $\sum_j p_{n,j} = P_n$ is a bounded sequence. For suppose on the contrary that
$P_n \to \infty$. Since the sequence $z_n$ is convergent, it is bounded; therefore $z_n/P_n$ tends to
zero:

    $\frac{z_n}{P_n} = \sum_j \frac{p_{n,j}}{P_n}\, y_j \to 0.$    (5)''

The numbers $p_{n,j}/P_n$ are nonnegative and their sum is 1. Therefore by compactness
we can select a subsequence for which they converge to limits:

    $\frac{p_{n,j}}{P_n} \to q_j.$

These limits satisfy $\sum q_j = 1$. It follows from (5)'' that

    $\sum_j q_j y_j = 0.$

Subtract this from (5)':

    $z_n = \sum_j (p_{n,j} - q_j)\, y_j.$

For each j for which $q_j > 0$, $p_{n,j} \to \infty$; therefore for n large enough, this is a positive
representation of $z_n$, showing that (5)' is not a minimal representation. This
contradiction shows that the sequence $P_n = \sum_j p_{n,j}$ is bounded. But then by local
compactness we can select a subsequence for which $p_{n,j} \to p_j$ for all j. Let n tend to
$\infty$ in (5)'; we obtain

    $z = \lim z_n = \sum_j p_j y_j.$

Thus the limit z can be represented in the form (5); this proves that the set of all
points of form (5) is closed in the Euclidean norm.

We note that the origin belongs to K.

Let y be a vector that does not belong to K. Since K is closed and convex,
according to the hyperplane separation Theorem 7 of Chapter 12 there is a closed
half-space

    $\eta x \ge c$    (7)

that contains K but not y:

    $\eta y < c.$    (8)

Since 0 belongs to K, it follows from (7) that $0 \ge c$. Combining this with (8), we get

    $\eta y < 0.$    (9)

Since $k y_j$ belongs to K for any positive constant k, it follows from (7) that

    $k\,\eta y_j \ge c, \quad j = 1, \ldots, m$

for all k > 0; this is the case only if

    $\eta y_j \ge 0, \quad j = 1, \ldots, m.$    (10)

Thus if y is not of form (5), there is an $\eta$ that according to (10) satisfies (6) but
according to (9) violates (6)'. This completes the proof of Theorem 1. □
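
The alternative in Theorem 1 can be made concrete in the plane. The sketch below treats only the special case of two linearly independent vectors $y_1$, $y_2$ in two dimensions (an assumption made to keep the example exact and short): it either returns the nonnegative coefficients in (5), or returns a row vector $\xi$ that satisfies (6) but violates (6)'. The certificate is taken from the rows of the inverse of the matrix with columns $y_1$, $y_2$, since the ith row $\xi_i$ satisfies $\xi_i y_j = 1$ for j = i and 0 otherwise, and $\xi_i y$ equals the ith coefficient. The function name and sample vectors are hypothetical.

```python
from fractions import Fraction

def farkas_2d(y1, y2, y):
    """Decide whether y = p1*y1 + p2*y2 with p1, p2 >= 0 (y1, y2 independent).

    Returns ("combination", (p1, p2)) or ("certificate", xi), where xi
    satisfies xi.y1 >= 0, xi.y2 >= 0 but xi.y < 0.
    """
    det = Fraction(y1[0] * y2[1] - y1[1] * y2[0])
    if det == 0:
        raise ValueError("y1 and y2 must be linearly independent")
    # Rows of the inverse of the matrix whose columns are y1 and y2.
    xi1 = (y2[1] / det, -y2[0] / det)
    xi2 = (-y1[1] / det, y1[0] / det)
    # The coefficients of y in the basis y1, y2.
    p1 = xi1[0] * y[0] + xi1[1] * y[1]
    p2 = xi2[0] * y[0] + xi2[1] * y[1]
    if p1 >= 0 and p2 >= 0:
        return ("combination", (p1, p2))
    return ("certificate", xi1 if p1 < 0 else xi2)

print(farkas_2d((1, 0), (1, 1), (3, 1)))  # inside the cone
print(farkas_2d((1, 0), (1, 1), (0, 1)))  # outside the cone
```

With more than two generating vectors, or with dependent ones, a certificate can no longer be read off from an inverse matrix in this way; the proof above shows it exists but this sketch does not construct it.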




Exercise 1. Show that K defined by (5) is a convex set.

We reformulate Theorem 1 in matrix language by defining the n x m matrix Y as

    $Y = (y_1, \ldots, y_m),$

that is, the matrix whose columns are $y_j$. We denote the column vector formed by
$p_1, \ldots, p_m$ by p.

We shall call a vector, column or row, nonnegative, denoted as $\ge 0$, if all its
components are nonnegative. The inequality $x \ge z$ means $x - z \ge 0$.

Exercise 2. Show that if $x \ge z$ and $\xi \ge 0$, then $\xi x \ge \xi z$.

Theorem 1'. Given an n x m matrix Y, a vector y with n components can be
written in the form

    $y = Yp, \qquad p \ge 0$    (11)

iff every row vector $\xi$ that satisfies

    $\xi Y \ge 0$    (12)

also satisfies

    $\xi y \ge 0.$    (12)'

For the proof, we merely observe that (11) is the same as (5), (12) the same as (6),
and (12)' the same as (6)'. □

The following is a useful extension.

Theorem 2. Given an n x m matrix Y and a column vector y with n
components, the inequality

    $y \ge Yp, \qquad p \ge 0$    (13)

can be satisfied iff every $\xi$ that satisfies

    $\xi Y \ge 0, \qquad \xi \ge 0$    (14)

also satisfies

    $\xi y \ge 0.$    (15)

Proof. To prove necessity, multiply (13) by $\xi$ on the left and use (14) to deduce
(15). Conversely, by definition of $\ge 0$ for vectors, (13) means that there is a column
vector z with n components such that

    $y = Yp + z, \qquad z \ge 0, \quad p \ge 0.$    (13)'

We rewrite (13)' by introducing the n x n identity matrix I, the augmented
matrix (Y, I) and the augmented vector $\begin{pmatrix} p \\ z \end{pmatrix}$. In terms of these, (13)' can be written as

    $y = (Y, I)\begin{pmatrix} p \\ z \end{pmatrix}, \qquad \begin{pmatrix} p \\ z \end{pmatrix} \ge 0,$    (13)''

and (14) can be written as

    $\xi\,(Y, I) \ge 0.$    (14)'

We now apply Theorem 1' to the augmented matrix and vector to deduce that if
(15) is satisfied whenever (14)' is, then (13)'' has a solution, as asserted in
Theorem 2. □


Theorem 3 (Duality Theorem). Let $Y$ be a given $n \times m$ matrix, $y$ a given
column vector with $n$ components, and $\gamma$ a given row vector with $m$ components.

We define two quantities $S$ and $s$ as follows:

Definition.

$$S = \sup_p \gamma p \tag{16}$$

for all column vectors $p$ with $m$ components satisfying

$$y \ge Yp, \qquad p \ge 0. \tag{17}$$

We call the set of $p$ satisfying (17) admissible for the sup problem (16).

$$s = \inf_\xi \xi y \tag{18}$$

for all row vectors $\xi$ with $n$ components satisfying the admissibility conditions

$$\gamma \le \xi Y, \qquad \xi \ge 0. \tag{19}$$

We call the set of $\xi$ satisfying (19) admissible for the inf problem (18).

Assertion. Suppose that there are admissible vectors $p$ and $\xi$; then $S$ and $s$ are
finite, and

$$S = s.$$

Proof. Let $p$ and $\xi$ be admissible vectors. Multiply (17) by $\xi$ on the left, (19) by $p$
on the right. Using Exercise 2 we conclude that

$$\xi y \ge \xi Y p \ge \gamma p.$$

This shows that any $\gamma p$ is bounded from above by every $\xi y$; therefore

$$s \ge S. \tag{20}$$

To show that equality actually holds, it suffices to display a single $p$ admissible for
the sup problem (16) for which

$$\gamma p \ge s. \tag{21}$$

To accomplish this, we combine (17) and (21) into a single inequality by augmenting
the matrix $Y$ with an extra row $-\gamma$, and the vector $y$ with an extra component $-s$:

$$\binom{y}{-s} \ge \binom{Y}{-\gamma} p, \qquad p \ge 0. \tag{22}$$

If this inequality has no solution, then according to Theorem 2 there is a row vector $\xi$
and a scalar $\alpha$ such that

$$(\xi, \alpha)\binom{Y}{-\gamma} \ge 0, \qquad (\xi, \alpha) \ge 0, \tag{23}$$

but

$$(\xi, \alpha)\binom{y}{-s} < 0. \tag{24}$$

We claim that $\alpha > 0$; for, if $\alpha = 0$, then (23) implies that

$$\xi Y \ge 0, \qquad \xi \ge 0, \tag{23'}$$

and (24) that

$$\xi y < 0. \tag{24'}$$

According to the "only if" part of Theorem 2 this shows that (13), the same as (17),
cannot be satisfied; this means that there is no admissible $p$, contrary to assumption.
Having shown that $\alpha$ is necessarily positive, we may, because of the homogeneity
of (23) and (24), take $\alpha = 1$. Writing out these inequalities gives

$$\xi Y \ge \gamma, \qquad \xi \ge 0 \tag{25}$$

and

$$\xi y < s. \tag{26}$$

Inequality (25), the same as (19), shows that $\xi$ is admissible; (26) shows that $s$ is
not the infimum (18), a contradiction we got into by denying that we can satisfy (21).
Therefore (21) can be satisfied; this implies that equality holds in (20). This proves
that $S = s$. □
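
The inequality chain behind (20), and the equality $S = s$, can be checked by hand on a small instance. The Python sketch below is only an illustration: the matrix `Y`, the vectors `y` and `gamma`, and all helper names are assumptions made for the example, not data from the text. It uses exact rational arithmetic so that the comparisons are not blurred by rounding.

```python
from fractions import Fraction as F

# Illustrative data (an assumption, not from the text): n = m = 2.
Y = [[F(1), F(2)],
     [F(3), F(1)]]
y = [F(4), F(6)]        # column vector with n components
gamma = [F(1), F(1)]    # row vector with m components

def mat_vec(A, p):
    """Column vector A p."""
    return [sum(a * b for a, b in zip(row, p)) for row in A]

def vec_mat(xi, A):
    """Row vector xi A."""
    return [sum(xi[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def sup_admissible(p):
    """Conditions (17): y >= Yp and p >= 0."""
    return all(v >= 0 for v in p) and all(u >= v for u, v in zip(y, mat_vec(Y, p)))

def inf_admissible(xi):
    """Conditions (19): gamma <= xi Y and xi >= 0."""
    return all(v >= 0 for v in xi) and all(g <= v for g, v in zip(gamma, vec_mat(xi, Y)))

# Any admissible pair satisfies xi y >= xi Y p >= gamma p, which gives (20).
p, xi = [F(1), F(1)], [F(1), F(1)]
assert sup_admissible(p) and inf_admissible(xi)
assert dot(xi, y) >= dot(xi, mat_vec(Y, p)) >= dot(gamma, p)

# An admissible pair with equal objective values certifies that both are optimal.
p_opt, xi_opt = [F(8, 5), F(6, 5)], [F(2, 5), F(1, 5)]
assert sup_admissible(p_opt) and inf_admissible(xi_opt)
print(dot(gamma, p_opt), dot(xi_opt, y))   # prints 14/5 14/5
```

For this instance both problems have the value 14/5; by (20), no admissible $p$ can do better than an admissible $\xi$ allows, so exhibiting such a pair is already a proof of optimality.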

Exercise 3. Show that the sup and inf in Theorem 3 are a maximum and a
minimum. [Hint: The sign of equality holds in (21).]

We give now an application of the duality theorem in economics.
We are keeping track of $n$ different kinds of food (milk, meat, fruit, bread, etc.)
and $m$ different kinds of nutrients (protein, fat, carbohydrates, vitamins, etc.). We
denote

$y_{ij}$ = number of units of the $j$th nutrient present in one unit of the $i$th food item.
$\gamma_j$ = minimum daily requirement of the $j$th nutrient.
$y_i$ = price of one unit of the $i$th food item.

Note that all these quantities are nonnegative.

Suppose our daily food purchase consists of $\xi_i$ units of the $i$th food item. We insist
on satisfying all the daily minimum requirements:

$$\sum_i \xi_i y_{ij} \ge \gamma_j, \qquad j = 1, \ldots, m. \tag{27}$$

This inequality can be satisfied, provided that each nutrient is present in at least
one of the foods.

The total cost of the purchase is

$$\sum_i \xi_i y_i. \tag{28}$$

A natural question is, What is the minimal cost of food that satisfies the daily
minimum requirements? Clearly, this is the minimum of (28) subject to (27) and
$\xi \ge 0$, since we cannot purchase negative amounts. If we identify the column vector
formed by the $y_i$ with $y$, the row vector formed by the $\gamma_j$ with $\gamma$, and the matrix $y_{ij}$
with $Y$, the quantity (28) to be minimized is the same as (18), and (27) is the same as
(19). Thus the infimum $s$ in the duality theorem can in this model be identified with
minimum cost.

To arrive at an interpretation of the supremum $S$ we denote by $\{p_j\}$ a possible set
of values for the nutrients that is consistent with the prices. That is, we require that

$$y_i \ge \sum_j y_{ij} p_j, \qquad i = 1, \ldots, n. \tag{29}$$

The value of the minimum daily requirement is

$$\sum_j \gamma_j p_j. \tag{30}$$

Since clearly the $p_j$ are nonnegative, the restriction (29) is the same as (17). The quantity
(30) is the same as that maximized in (16). Thus the quantity $S$ in the duality theorem
is the largest possible value of the total daily requirements consistent with the prices.
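
A toy instance makes the two interpretations concrete. In the Python sketch below, the foods, nutrient contents, requirements and prices are invented for illustration, and the search is a crude exhaustive scan over a grid of quarter units rather than a linear programming method; it happens to be exact here only because the optimal purchase and the optimal nutrient values lie on the grid.

```python
from fractions import Fraction as F

# Hypothetical data: 2 foods, 2 nutrients (not from the text).
Y = [[2, 1],
     [1, 3]]          # Y[i][j]: units of nutrient j in one unit of food i
gamma = [8, 9]        # minimum daily requirement of nutrient j
y = [3, 4]            # price of one unit of food i

grid = [F(k, 4) for k in range(41)]   # 0, 1/4, ..., 10

def min_cost():
    """Cheapest purchase xi on the grid satisfying (27); the inf problem (18)."""
    best = None
    for x1 in grid:
        for x2 in grid:
            xi = (x1, x2)
            if all(sum(xi[i] * Y[i][j] for i in range(2)) >= gamma[j] for j in range(2)):
                cost = sum(xi[i] * y[i] for i in range(2))
                if best is None or cost < best[0]:
                    best = (cost, xi)
    return best

def max_value():
    """Most valuable requirement under nutrient values p satisfying (29); the sup problem (16)."""
    best = None
    for p1 in grid:
        for p2 in grid:
            p = (p1, p2)
            if all(sum(Y[i][j] * p[j] for j in range(2)) <= y[i] for i in range(2)):
                value = sum(gamma[j] * p[j] for j in range(2))
                if best is None or value > best[0]:
                    best = (value, p)
    return best

cost, purchase = min_cost()
value, prices = max_value()
print(cost, purchase, value, prices)
```

In this example the minimum cost and the maximum value of the requirements coincide, as the duality theorem asserts.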

A second application comes from game theory. We consider two-person,
deterministic, zero-sum games. Such a game can (by definition) always be presented
as a matrix game, defined as follows:

An $n \times m$ matrix $Y$, called the payoff matrix, is given. The game consists of player
C picking one of the columns and player R picking one of the rows; neither player
knows what the other has picked but both are familiar with the payoff matrix. If C
chooses column $j$ and R chooses row $i$, then the outcome of the game is the payment
of the amount $Y_{ij}$ by player C to player R. If $Y_{ij}$ is a negative number, then R pays C.

We think of this game as being played repeatedly many times. Furthermore, the
players do not employ the same strategy each time, that is, do not pick the same row,
respectively column, each time, but employ a so-called mixed strategy, which
consists of picking rows, respectively columns, at random but according to a set of
frequencies which each player is free to choose. That is, player C will choose the $j$th
column with frequency $x_j$, where $x$ is a probability vector, that is,

$$x_j \ge 0, \qquad \sum_j x_j = 1. \tag{31}$$

Player R will choose the $i$th row with frequency $\eta_i$,

$$\eta_i \ge 0, \qquad \sum_i \eta_i = 1. \tag{31'}$$

Since the choices are made at random, the choices of C and R are independent of
each other. It follows that the frequency with which C chooses column $j$ and R
chooses row $i$ in the same game is the product $\eta_i x_j$.

Since the payoff of C to R is $Y_{ij}$, the average payoff over a long time is

$$\sum_{i,j} \eta_i Y_{ij} x_j.$$

In vector-matrix notation that is

$$\eta Y x. \tag{32}$$

If C has picked his mix $x$ of strategies, then by observing over a long time R can
determine the relative frequencies that C is using, and therefore will choose his own
mix $\eta$ of strategies so that he maximizes his gain:

$$\max_\eta \eta Y x. \tag{33}$$

Suppose C is a conservative player, that is, C anticipates that R will adjust his mix so
as to gain the maximum amount (33). Since R's gain is C's loss, C chooses his mix $x$
to minimize his loss, that is, so that (33) is a minimum:

$$\min_x \max_\eta \eta Y x, \tag{34}$$

$x$ and $\eta$ probability vectors.

If, on the other hand, we suppose that R is the conservative player, R will assume
that C will guess R's mix $\eta$ first and therefore C will choose $x$ so that C's loss is
minimized:

$$\min_x \eta Y x. \tag{33'}$$

R therefore picks his mix $\eta$ so that the outcome (33)' is as large as possible:

$$\max_\eta \min_x \eta Y x. \tag{34'}$$

Theorem 4 (Minmax Theorem). The minmax (34) and the maxmin (34)',
where $\eta$ and $x$ are required to be probability vectors, are equal:

$$\min_x \max_\eta \eta Y x = \max_\eta \min_x \eta Y x. \tag{35}$$

The quantity (35) is called the value of the matrix game $Y$.
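
To see what the theorem says in a concrete case, one can evaluate both sides of (35) for a small game. The Python sketch below is an illustration under stated assumptions: the $2 \times 2$ payoff matrix is made up, and the minimum and maximum are taken over a uniform grid of probability vectors rather than over all of them. The grid answer is exact for this matrix only because the optimal mixes, $x = (1/4, 3/4)$ and $\eta = (1/2, 1/2)$, happen to lie on the grid.

```python
from fractions import Fraction as F

# Hypothetical payoff matrix: Y[i][j] is paid by C to R when R picks row i, C picks column j.
Y = [[3, 1],
     [0, 2]]

def simplex_grid(steps):
    """Probability vectors with two components on a uniform grid."""
    return [(F(k, steps), 1 - F(k, steps)) for k in range(steps + 1)]

def payoff(eta, x):
    """Average payoff eta Y x, as in (32)."""
    return sum(eta[i] * Y[i][j] * x[j] for i in range(2) for j in range(2))

def minmax(steps=20):
    """Grid version of (34): C minimizes R's best response."""
    grid = simplex_grid(steps)
    return min(max(payoff(eta, x) for eta in grid) for x in grid)

def maxmin(steps=20):
    """Grid version of (34)': R maximizes C's best response."""
    grid = simplex_grid(steps)
    return max(min(payoff(eta, x) for x in grid) for eta in grid)

print(minmax(), maxmin())      # mixed strategies: both sides agree
print(minmax(1), maxmin(1))    # steps=1 allows pure strategies only: a gap appears
```

With pure strategies alone (`steps=1`) the two quantities differ, 2 versus 1; once mixing is allowed they meet at the value 3/2 of this game.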

Proof. Denote by $E$ the $n \times m$ matrix of all 1s. For any pair of probability vectors
$\eta$ and $x$, $\eta E x = 1$. Therefore if we replace $Y$ by $Y + kE$, we merely add $k$ to both (34)
and (34)'. For $k$ large enough all entries of $Y + kE$ are positive; so we may consider
only matrices $Y$ with all positive entries.

We shall apply the duality theorem with

$$\gamma = (1, \ldots, 1) \quad \text{and} \quad y = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}. \tag{36}$$

Since $y$ is positive, the maximum problem

$$S = \max_p \gamma p, \qquad y \ge Yp, \quad p \ge 0 \tag{37}$$

has positive admissible vectors $p$. Since the entries of $\gamma$ are positive, $S > 0$. We
denote by $p_0$ a vector where the maximum is achieved.

Since $Y > 0$, the minimum problem

$$s = \min_\xi \xi y, \qquad \xi Y \ge \gamma, \quad \xi \ge 0 \tag{37'}$$

has admissible vectors $\xi$. We denote by $\xi_0$ a vector where the minimum is reached.

According to (36), all components of $\gamma$ are 1; therefore $\gamma p_0$ is the sum of the
components of $p_0$. Since $\gamma p_0 = S$,

$$x_0 = \frac{p_0}{S} \tag{38}$$

is a probability vector. Using an analogous argument, we deduce that

$$\eta_0 = \frac{\xi_0}{s} \tag{38'}$$

is a probability vector.

We claim that $x_0$ and $\eta_0$ are solutions of the minmax and maxmin problems (34)
and (34)', respectively. To see this, set $p_0$ into the second part of (37), and divide by
$S$. Using the definition $x_0 = p_0/S$, we get

$$\frac{y}{S} \ge Y x_0. \tag{39}$$

Multiply this on the left with any probability vector $\eta$. Since according to (36) all
components of $y$ are 1, $\eta y = 1$, and so

$$\frac{1}{S} \ge \eta Y x_0. \tag{40}$$

It follows from this that

$$\frac{1}{S} \ge \max_\eta \eta Y x_0,$$

from which it follows that

$$\frac{1}{S} \ge \min_x \max_\eta \eta Y x. \tag{41}$$

On the other hand, we deduce from (40) that for all $\eta$,

$$\frac{1}{S} \ge \min_x \eta Y x,$$

from which it follows that

$$\frac{1}{S} \ge \max_\eta \min_x \eta Y x. \tag{42}$$


Similarly we set $\xi_0$ for $\xi$ into the second part of (37)', divide by $s$, and multiply by
any probability vector $x$. By definition (38)', $\eta_0 = \xi_0/s$; since according to (36) all
components of $\gamma$ are 1, $\gamma x = 1$. So we get

$$\eta_0 Y x \ge \frac{1}{s}. \tag{40'}$$

From this we deduce that for any probability vector $x$,

$$\max_\eta \eta Y x \ge \frac{1}{s},$$

from which it follows that

$$\min_x \max_\eta \eta Y x \ge \frac{1}{s}. \tag{41'}$$

On the other hand, it follows from (40)' that

$$\min_x \eta_0 Y x \ge \frac{1}{s},$$

from which it follows that

$$\max_\eta \min_x \eta Y x \ge \frac{1}{s}. \tag{42'}$$

Since by the duality theorem $S = s$, (41) and (41)' together show that

$$\min_x \max_\eta \eta Y x = \frac{1}{s} = \frac{1}{S},$$

while (42) and (42)' show that

$$\max_\eta \min_x \eta Y x = \frac{1}{s} = \frac{1}{S}.$$

This proves the minmax theorem. □

The minmax theorem is due to von Neumann. It has important implications for
economic theory.



CHAPTER 14


Normed Linear Spaces 


In Chapter 12, Theorem 7, we saw that every open, convex set $K$ in a linear space $X$
over $\mathbb{R}$ containing the origin can be described as the set of vectors $x$ satisfying
$p(x) < 1$, where $p$, the gauge function of $K$, is a subadditive, positive homogeneous
function, positive except at the origin. Here we consider such functions with one
additional property: evenness, that is, $p(-x) = p(x)$. Such a function is called a
norm, and is denoted by the symbol $|x|$, the same as absolute value. We list now the
properties of a norm:

(i) Positivity: $|x| > 0$ for $x \ne 0$, $|0| = 0$.

(ii) Subadditivity: $|x + y| \le |x| + |y|$. (1)

(iii) Homogeneity: for any real number $k$, $|kx| = |k|\,|x|$.

A linear space with a norm is called a normed linear space. Except for Theorem
4, in this chapter $X$ denotes a finite-dimensional normed linear space.

Definition. The set of points $x$ in $X$ satisfying $|x| < 1$ is called the open unit
ball around the origin; the set $|x| \le 1$ is called the closed unit ball.

Exercise 1. (a) Show that the open and closed unit balls are convex.

(b) Show that the open and closed unit balls are symmetric with respect to the
origin, that is, if $x$ belongs to the unit ball, so does $-x$.

Definition. The distance of two vectors $x$ and $y$ in $X$ is defined as $|x - y|$.


Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright © 2007 John Wiley & Sons, Inc.

Exercise 2. Prove the triangle inequality, that is, for all $x, y, z$ in $X$,

$$|x - z| \le |x - y| + |y - z|. \tag{2}$$

Definition. Given a point $y$ and a positive number $r$, the set of $x$ satisfying
$|x - y| < r$ is called the open ball of radius $r$, center $y$; it is denoted $B(y, r)$.

Examples

Here $X = \mathbb{R}^n$ and $x = (a_1, \ldots, a_n)$.

(a) Define

$$|x|_\infty = \max_j |a_j|. \tag{3}$$

Properties (i) and (iii) are obvious; property (ii) is easy to show.

(b) Define $|x|_2$ as the Euclidean norm:

$$|x|_2 = \Bigl(\sum_j |a_j|^2\Bigr)^{1/2}. \tag{4}$$

Properties (i) and (iii) are obvious; property (ii) was shown in Theorem 3 of
Chapter 7.

(c) Define

$$|x|_1 = \sum_j |a_j|. \tag{5}$$

Exercise 3. Prove that $|x|_1$ defined by (5) has all three properties (1) of a norm.

The next example includes the first three as special cases:

(d) $p$ any real number, $1 \le p$; we define

$$|x|_p = \Bigl(\sum_j |a_j|^p\Bigr)^{1/p}. \tag{6}$$

Theorem 1. $|x|_p$ defined by (6) is a norm, that is, it has properties (1).


Proof. Properties (i) and (iii) are obvious. To prove (ii), we need the
following:

Hölder's Inequality. Let $p$ and $q$ be positive numbers that satisfy

$$\frac{1}{p} + \frac{1}{q} = 1. \tag{7}$$

Let $(a_1, \ldots, a_n) = x$ and $(b_1, \ldots, b_n) = y$ be two vectors; then

$$xy \le |x|_p\,|y|_q, \tag{8}$$

where the product $xy$ is defined as

$$xy = \sum_j a_j b_j; \tag{9}$$

$|x|_p$, $|y|_q$ are defined by (6). Equality in (8) holds iff $|a_j|^p$ and $|b_j|^q$ are proportional
and $\operatorname{sgn} a_j = \operatorname{sgn} b_j$, $j = 1, \ldots, n$.

Exercise 4. Prove or look up a proof of Hölder's inequality.

Note. For $p = q = 2$, Hölder's inequality is the Schwarz inequality (see Theorem 1,
Chapter 7).

Exercise 5. Prove that

$$|x|_\infty = \lim_{p \to \infty} |x|_p,$$

where $|x|_\infty$ is defined by (3).
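
The norms (3) to (6) are easy to experiment with numerically. The following Python sketch is an illustration only: the function names and the sample vectors are assumptions made for the example. It evaluates $|x|_p$ for growing $p$, checks subadditivity up to rounding, and shows the values approaching the max norm of Exercise 5.

```python
def norm_p(x, p):
    """The norm (6); assumes p >= 1."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def norm_inf(x):
    """The max norm (3)."""
    return max(abs(a) for a in x)

# Sample vectors chosen for illustration.
x = [3.0, -4.0, 1.0]
z = [1.0, 2.0, -2.0]
s = [a + b for a, b in zip(x, z)]

for p in (1, 2, 4, 16, 64):
    # Subadditivity, property (ii), checked up to floating-point rounding.
    assert norm_p(s, p) <= norm_p(x, p) + norm_p(z, p) + 1e-12
    print(p, norm_p(x, p))

print("max norm:", norm_inf(x))
```

A numerical check of this kind proves nothing, of course; it only makes the statement of Exercise 5 plausible before one proves it.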

Corollary. For any vector $x$,

$$|x|_p = \max_{|y|_q = 1} xy. \tag{10}$$

Proof. Inequality (8) shows that when $|y|_q = 1$, $xy$ cannot exceed $|x|_p$. Therefore
to prove (10) we have to exhibit a single vector $y_0$, $|y_0|_q = 1$, for which $xy_0 = |x|_p$.
Here it is:

$$y_0 = \frac{z}{|x|_p^{p/q}}, \qquad z = (c_1, \ldots, c_n), \quad c_j = \operatorname{sgn} a_j\,|a_j|^{p/q}. \tag{11}$$

Clearly

$$|y_0|_q = \frac{|z|_q}{|x|_p^{p/q}} \tag{12}$$

and

$$|z|_q^q = \sum_j |c_j|^q = \sum_j |a_j|^p = |x|_p^p. \tag{12'}$$

Combining (12) and (12)',

$$|y_0|_q = \frac{|x|_p^{p/q}}{|x|_p^{p/q}} = 1. \tag{13}$$

From (11),

$$xy_0 = \frac{xz}{|x|_p^{p/q}} = \frac{\sum_j |a_j|^{1 + p/q}}{|x|_p^{p/q}} = \frac{|x|_p^p}{|x|_p^{p/q}} = |x|_p, \tag{13'}$$

where we have used (7) to set $1 + p/q = p$. Formulas (13) and (13)' complete the
proof of the corollary. □
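
The extremal vector in the proof can be built and tested directly. The Python sketch below is a hedged illustration: `extremal_vector` is a hypothetical helper name, the sample vector is arbitrary, and the code assumes $p > 1$ and $x \ne 0$ so that the conjugate exponent $q$ from (7) is finite and the normalization is defined.

```python
import math

def norm_p(x, p):
    """The norm (6)."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def extremal_vector(x, p):
    """Return (y0, q) with y0 built as in (11); assumes p > 1 and x != 0."""
    q = p / (p - 1.0)                  # conjugate exponent, from 1/p + 1/q = 1
    ratio = p / q                      # equals p - 1
    # c_j = sgn(a_j) |a_j|^(p/q)
    z = [math.copysign(abs(a) ** ratio, a) for a in x]
    scale = norm_p(x, p) ** ratio      # |x|_p^(p/q)
    return [c / scale for c in z], q

x = [3.0, -4.0, 1.0]                   # arbitrary sample vector
p = 3.0
y0, q = extremal_vector(x, p)
xy0 = sum(a * b for a, b in zip(x, y0))

print("|y0|_q =", norm_p(y0, q))       # should be 1 up to rounding, as in (13)
print("x.y0   =", xy0, " |x|_p =", norm_p(x, p))   # should agree, as in (13)'
```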


To prove subadditivity for $|x|_p$ we use the corollary. Let $x$ and $z$ be any two
vectors; then by (10),

$$|x + z|_p = \max_{|y|_q = 1} (x + z)y \le \max_{|y|_q = 1} xy + \max_{|y|_q = 1} zy = |x|_p + |z|_p.$$

This proves that the $l^p$ norm is subadditive. □

We return now to arbitrary norms.

Definition. Two norms in a finite-dimensional linear space $X$, $|x|_1$ and $|x|_2$, are
called equivalent if there is a constant $c$ such that for all $x$ in $X$,

$$|x|_1 \le c\,|x|_2, \qquad |x|_2 \le c\,|x|_1. \tag{14}$$

Theorem 2. In a finite-dimensional linear space, all norms are equivalent; that
is, any two satisfy (14) with some $c$, depending on the pair of norms.

Proof. Any finite-dimensional linear space $X$ over $\mathbb{R}$ is isomorphic to $\mathbb{R}^n$,
$n = \dim X$; so we may take $X$ to be $\mathbb{R}^n$. In Chapter 7 we introduced the Euclidean
norm:

$$\|x\| = \Bigl(\sum_j a_j^2\Bigr)^{1/2}, \qquad x = (a_1, \ldots, a_n). \tag{15}$$

Denote by $e_j$ the unit vectors in $\mathbb{R}^n$:

$$e_j = (0, \ldots, 1, 0, \ldots, 0), \qquad j = 1, \ldots, n.$$

Then $x = (a_1, \ldots, a_n)$ can be written as

$$x = \sum_j a_j e_j. \tag{16}$$

Let $|x|$ be any other norm in $\mathbb{R}^n$. Using subadditivity and homogeneity repeatedly
we get

$$|x| \le \sum_j |a_j|\,|e_j|. \tag{16'}$$

Applying the Schwarz inequality to (16)' (see Theorem 1, Chapter 7), we get, using
(15),

$$|x| \le \Bigl(\sum_j |e_j|^2\Bigr)^{1/2} \Bigl(\sum_j a_j^2\Bigr)^{1/2} = c\,\|x\|, \tag{17}$$

where $c$ abbreviates $\bigl(\sum_j |e_j|^2\bigr)^{1/2}$. This gives one half of inequalities (14).

To get the other half, we show first that $|x|$ is a continuous function with respect to
the Euclidean distance. By subadditivity,

$$|x| \le |x - y| + |y|, \qquad |y| \le |x - y| + |x|,$$

from which we deduce that

$$\bigl|\,|x| - |y|\,\bigr| \le |x - y|.$$

Using inequality (17), we get

$$\bigl|\,|x| - |y|\,\bigr| \le c\,\|x - y\|,$$

which shows that $|x|$ is a continuous function in the Euclidean norm.

It was shown in Chapter 7 that the unit sphere $S$ in a finite-dimensional Euclidean
space, $\|x\| = 1$, is a compact set. Therefore the continuous function $|x|$ achieves its
minimum on $S$. Since by (1) $|x|$ is positive at every point of $S$, it follows that the
minimum $m$ is positive. Thus we conclude that

$$0 < m \le |x| \quad \text{when } \|x\| = 1. \tag{18}$$

Since both $|x|$ and $\|x\|$ are homogeneous functions, we conclude that

$$m\,\|x\| \le |x| \tag{19}$$

for all $x$ in $\mathbb{R}^n$. This proves the second half of the inequalities (14), and proves that
any norm in $\mathbb{R}^n$ is equivalent in the sense of (14) with the Euclidean norm.

The notion of equivalence is transitive; if $|x|_1$ and $|x|_2$ are both equivalent to the
Euclidean norm, then they are equivalent to each other. This completes the proof of
Theorem 2. □
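
The two constants in the proof can be made concrete for a specific norm. The Python sketch below is an illustration, not part of the argument: it takes the norm $|x|_1$ of (5) on $\mathbb{R}^3$ as the "other" norm, computes the constant $c$ of (17) exactly, and estimates the minimum $m$ of (18) by random sampling of the Euclidean unit sphere. The sampling estimate can only overshoot the true minimum, which for $|x|_1$ equals 1 and is attained at the unit vectors $e_j$; function names, the sample size and the seed are arbitrary choices.

```python
import math
import random

def norm1(x):
    """The norm (5)."""
    return sum(abs(a) for a in x)

def euclid(x):
    """The Euclidean norm (15)."""
    return math.sqrt(sum(a * a for a in x))

def upper_constant(norm, n):
    """c = (sum_j |e_j|^2)^(1/2), the constant in (17)."""
    total = 0.0
    for j in range(n):
        e = [0.0] * n
        e[j] = 1.0
        total += norm(e) ** 2
    return math.sqrt(total)

def estimate_lower_constant(norm, n, samples=2000, seed=0):
    """Monte Carlo estimate of m in (18); never smaller than the true minimum."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = euclid(v)
        if r == 0.0:
            continue
        u = [a / r for a in v]          # a point of the Euclidean unit sphere
        best = min(best, norm(u))
    return best

c = upper_constant(norm1, 3)
m_est = estimate_lower_constant(norm1, 3)
print("c =", c, " estimated m =", m_est)
```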


Definition. A sequence $\{x_n\}$ in a normed linear space is called convergent to
the limit $x$, denoted as $\lim x_n = x$, if $\lim |x_n - x| = 0$.

Obviously, the notion of convergence of sequences is the same with respect to two
equivalent norms; so by Theorem 2, it is the same for any two norms.

Definition. A set $S$ in a normed linear space is called closed if it contains the
limits of all convergent sequences $\{x_n\}$, $x_n$ in $S$.

Exercise 6. Prove that every subspace of a finite-dimensional normed linear
space is closed.

Definition. A set $S$ in a normed linear space is called bounded if it is contained
in some ball, that is, if there is an $R$ such that for all points $z$ in $S$, $|z| \le R$. Clearly, if a
set is bounded in the sense of one norm, it is bounded in the sense of any equivalent
norm, and so by Theorem 2 for all norms.

Definition. A sequence of vectors $\{x_k\}$ in a normed linear space is called a
Cauchy sequence if $|x_k - x_j|$ tends to zero as $k$ and $j$ tend to infinity.

Theorem 3. (i) In a finite-dimensional normed linear space $X$, every Cauchy
sequence converges to a limit.

(ii) Every bounded infinite sequence in a finite-dimensional normed linear
space $X$ has a convergent subsequence.

Property (i) of $X$ is called completeness, and property (ii) is called local
compactness.

Proof. (i) Introduce a Euclidean structure in $X$. According to Theorem 2, the
Euclidean norm and the norm in $X$ are equivalent. Therefore a Cauchy sequence in
the norm of $X$ is also a Cauchy sequence in the Euclidean norm. As shown in
Chapter 7, a Cauchy sequence in a finite-dimensional Euclidean
space converges. But then the sequence also converges in the norm of $X$.

(ii) A sequence $\{x_n\}$ that is bounded in the norm of $X$ is also bounded in the
Euclidean norm imposed on $X$. According to Theorem 16 of Chapter 7, it contains a
subsequence that converges in the Euclidean norm. But then that subsequence also
converges in the norm of $X$. □

Just as in Euclidean space (see Theorem 17 in Chapter 7), part (ii) of Theorem 3
has a converse:

Theorem 4. Let $X$ be a normed linear space that is locally compact, that is, in
which every bounded sequence has a convergent subsequence. Then $X$ is
finite-dimensional.

Proof. We need the following result.

Lemma 5. Let $Y$ be a finite-dimensional subspace of a normed linear space $X$.
Let $x$ be a vector in $X$ that does not belong to $Y$. Then

$$d = \inf_{y \text{ in } Y} |x - y|$$

is positive.

Proof. Suppose not; then there would be a sequence of vectors $\{y_n\}$ in $Y$ such that

$$\lim |x - y_n| = 0.$$

In words, $y_n$ tends to $x$. It follows that $\{y_n\}$ is a Cauchy sequence; according to part
(i) of Theorem 3, $y_n$ converges to a limit in $Y$. This would show that the limit $x$ of
$\{y_n\}$ belongs to $Y$, contrary to the choice of $x$. □

Suppose $X$ infinite-dimensional; we shall construct a sequence $\{y_k\}$ in $X$ with the
following properties:

$$|y_k| < 2, \qquad |y_k - y_l| \ge 1 \quad \text{for } k \ne l. \tag{20}$$

Clearly, such a sequence is bounded and, equally clearly, contains no convergent
subsequence.

We shall construct the sequence recursively. Suppose $y_1, \ldots, y_n$ have been
chosen; denote by $Y$ the space spanned by them. Since $X$ is infinite-dimensional,
there is an $x$ in $X$ that does not belong to $Y$. We appeal now to Lemma 5:

$$d = \inf_{y \text{ in } Y} |x - y| > 0.$$

By definition of infimum, there is a vector $y_0$ in $Y$ which satisfies

$$|x - y_0| < 2d.$$

Define

$$y_{n+1} = \frac{x - y_0}{d}. \tag{21}$$

It follows from the inequality above that $|y_{n+1}| < 2$. For any $y$ in $Y$, $y_0 + dy$ belongs
to $Y$. Therefore by definition of infimum,

$$|x - y_0 - dy| \ge d.$$

Dividing this by $d$ and using the definition of $y_{n+1}$, we get

$$|y_{n+1} - y| \ge 1.$$

Since every $y_l$, $l = 1, \ldots, n$, belongs to $Y$,

$$|y_{n+1} - y_l| \ge 1 \quad \text{for } l = 1, \ldots, n.$$

This completes the recursive construction of the sequence $\{y_k\}$ with property
(20). □


Theorem 4 is due to Frederic Riesz.

Exercise 7. Show that the infimum in Lemma 5 is a minimum.

We have seen in Theorem 5 of Chapter 7 that every linear function $l$ in a
Euclidean space can be written in the form of a scalar product $l(x) = (x, y)$.
Therefore by the Schwarz inequality, Theorem 1 of Chapter 7,

$$|l(x)| \le \|x\|\,\|y\|.$$

Combining this with (19), we deduce that

$$|l(x)| \le c\,|x|, \qquad c = \frac{\|y\|}{m}.$$

We can restate this as Theorem 6.

Theorem 6. Let $X$ be a finite-dimensional normed linear space, and let $l$ be a
linear function defined on $X$. Then there is a constant $c$ such that

$$|l(x)| \le c\,|x| \tag{22}$$

for all $x$ in $X$.

Corollary 6'. Every linear function on a finite-dimensional normed linear space
is continuous.

Proof. Using the linearity of $l$ and inequality (22), we deduce that

$$|l(x) - l(y)| = |l(x - y)| \le c\,|x - y|.$$

□

Definition. Denote by $c_0$ the infimum of all numbers $c$ for which (22) holds for
all $x$. Clearly, (22) holds for $c = c_0$, and $c_0$ is the smallest number $c$ for which (22)
holds; $c_0$ is called the norm of the linear function $l$, denoted as $|l|'$.


The norm of $l$ can also be characterized as

$$|l|' = \sup_{x \ne 0} \frac{|l(x)|}{|x|}. \tag{23}$$

It follows from (23) that for all $x$ and all $l$,

$$|l(x)| \le |l|'\,|x|. \tag{24}$$

Theorem 7. $X$ is a finite-dimensional normed linear space.

(i) Given a linear function $l$ defined on $X$, there is an $x$ in $X$, $x \ne 0$, for which
equality holds in (24).

(ii) Given a vector $x$ in $X$, there is a linear function $l$ defined on $X$, $l \ne 0$, for
which equality holds in (24).

Proof. (i) We shall show that the supremum in definition (23) of $|l|'$ is a maximum.
We note that the ratio $|l(x)|/|x|$ doesn't change if we replace $x$ by any nonzero multiple of $x$.
Therefore it suffices to take the supremum (23) over the unit sphere $|x| = 1$.

According to Corollary 6', $l(x)$ is a continuous function; then so is $|l(x)|$. Since the
space $X$ is locally compact, the continuous function $|l(x)|$ takes on its maximum
value at some point $x$ of the unit sphere. At this point, equality holds in (24).

(ii) If $x = 0$, any $l$ will do.

For $x \ne 0$, we define $l(x) = |x|$; since $l$ is linear, we set for any scalar $k$

$$l(kx) = k\,|x|. \tag{25}$$

We appeal now to the Hahn-Banach Theorem, Theorem 4 in Chapter 12. We choose
the positive homogeneous, subadditive function $p(x)$ to be $|x|$, and the subspace $U$ on
which $l$ is defined to consist of the multiples of $x$. It follows from (25) that for all $u$ in
$U$, $l(u) \le |u|$. According to Hahn-Banach, $l$ can be extended to all $y$ of $X$ so that
$l(y) \le |y|$ for all $y$. Setting $-y$ for $y$, we deduce that $|l(y)| \le |y|$ as well. So by
definition (23) of the norm of $l$, it follows that $|l|' \le 1$. Since $l(x) = |x|$, it follows
that $|l|' = 1$, so equality holds in (24). □

In Chapter 2 we have defined the dual of a finite-dimensional linear space $X$ as
the set of all linear functions $l$ defined on $X$. These functions form a linear space,
denoted $X'$. We have shown in Chapter 2 that the dual of $X'$ can be identified with
$X$ itself: $X'' = X$, as follows. For each $x$ in $X$ we define a linear function $f$ over $X'$ by
setting

$$f(l) = l(x). \tag{26}$$

We have shown in Chapter 2 that these are all the linear functions on $X'$.

When $X$ is a finite-dimensional normed linear space, there is an induced norm $|l|'$
in $X'$, defined by formula (23). This, in turn, induces a norm in the dual $X''$ of $X'$.

Theorem 8. The norm induced in $X''$ by the induced norm in $X'$ is the same as
the original norm in $X$.


Proof. The norm of a linear function $f$ on $X'$ is, according to formula (23),

$$|f|'' = \sup_{l \ne 0} \frac{|f(l)|}{|l|'}. \tag{27}$$

The linear functions $f$ on $X'$ are of the form (26); setting this into (27) gives

$$|f|'' = \sup_{l \ne 0} \frac{|l(x)|}{|l|'}. \tag{28}$$

According to (24), $|l(x)|/|l|' \le |x|$ for all $l \ne 0$. According to part (ii) of Theorem 7,
equality holds for some $l$. This proves that $|f|'' = |x|$. □

Exercise 8. Show that $|l|'$ defined by (23) satisfies all postulates for a norm
listed in (1).

Note. The dual of an infinite-dimensional normed linear space $X$ consists of all
linear functions on $X$ that are bounded in the sense of (22). The induced norm on $X'$ is
defined by (24). Theorem 7 holds in infinite-dimensional spaces.

The dual of $X'$ is defined analogously. For each $x$ in $X$, we can define a linear
function $f$ by formula (26); $f$ is bounded and its bound equals $|x|$. So $f$ lies in $X''$; but
for many spaces $X$ that are used in analysis, it is no longer true that all elements $f$ in
$X''$ are of the form (26).

Part (ii) of Theorem 7 can be stated as follows:

$$|x| = \max_{|l|' = 1} l(x) \tag{29}$$

for every vector $x$.

The following is an interesting generalization of (29).


Theorem 9. Let $Z$ be a subspace of $X$, $y$ any vector in $X$. The distance $d(y, Z)$ of
$y$ to $Z$ is defined to be

$$d(y, Z) = \inf_{z \text{ in } Z} |y - z|. \tag{30}$$

Then

$$d(y, Z) = \max l(y) \tag{31}$$

over all $l$ in $X'$ satisfying

$$|l|' \le 1, \qquad l(z) = 0 \quad \text{for } z \text{ in } Z. \tag{32}$$

Proof. By definition of distance, for any $\epsilon > 0$ there is a $z_0$ in $Z$ such that

$$|y - z_0| < d(y, Z) + \epsilon. \tag{33}$$

For any $l$ satisfying (32) we get, using (33), that

$$l(y) = l(y) - l(z_0) = l(y - z_0) \le |l|'\,|y - z_0| < d(y, Z) + \epsilon.$$

Since $\epsilon > 0$ is arbitrary, this shows that for all $l$ satisfying (32),

$$l(y) \le d(y, Z). \tag{34}$$

To show the opposite inequality we shall exhibit a linear function $m$ satisfying (32),
such that $m(y) = d(y, Z)$. Since for $y$ in $Z$ the result is trivial, we assume that the
vector $y$ does not belong to $Z$. We define the linear subspace $U$ to consist of all
vectors $u$ of the form

$$u = z + ky, \qquad z \text{ in } Z, \quad k \text{ any real number}. \tag{35}$$

We define the linear function $m(u)$ in $U$ by

$$m(u) = k\,d(y, Z). \tag{36}$$

Obviously, $m$ is zero for $u$ in $Z$; it follows from (35), (36), and the definition (30) of $d$
that

$$m(u) \le |u| \quad \text{for } u \text{ in } U. \tag{37}$$

By Hahn-Banach we can extend $m$ to all of $X$ so that (37) holds for all $x$; then

$$|m|' \le 1. \tag{37'}$$

Clearly, $m$ satisfies (32); on the other hand, we see by combining (35) and (36) that

$$m(y) = d(y, Z).$$

Since we have seen in (34) that $l(y) \le d(y, Z)$ for all $l$ satisfying (32), this completes
the proof of Theorem 9. □

In Chapter 1 we have introduced the notion of the quotient of a linear space $X$ by
one of its subspaces $Z$. We recall the definition: two vectors $x_1$ and $x_2$ in $X$ are
congruent mod $Z$,

$$x_1 \equiv x_2 \mod Z,$$

if $x_1 - x_2$ belongs to $Z$. We saw that this is an equivalence relation, and therefore we
can partition the vectors in $X$ into congruence classes $\{\ \}$. The set of congruence
classes $\{\ \}$ is denoted as $X/Z$ and can be made into a linear space; all this is described
in Chapter 1. We note that the subspace $Z$ is one of the congruence classes, which
serves as the zero element of the quotient space.

Suppose $X$ is a normed linear space; we shall show that then there is a natural way
of making $X/Z$ into a normed linear space, by defining the following norm for the
congruence classes:

$$|\{\ \}| = \inf |x|, \qquad x \in \{\ \}. \tag{38}$$

Theorem 10. Definition (38) is a norm, that is, has all three properties (1).

Proof. Every member $x$ of a given congruence class $\{\ \}$ can be described as
$x = x_0 - z$, $x_0$ some vector in $\{\ \}$, $z$ any vector in $Z$. We claim that property (i),
positivity, holds: for $\{\ \} \ne 0$,

$$|\{\ \}| > 0. \tag{38'}$$

Suppose on the contrary that $|\{\ \}| = 0$. In view of definition (38) this means that there
is a sequence $x_j$ in $\{\ \}$ such that

$$\lim |x_j| = 0. \tag{39}$$

Since all $x_j$ belong to the same class, they all can be written as

$$x_j = x_0 - z_j, \qquad z_j \text{ in } Z.$$

Setting this into (39) we get

$$\lim |x_0 - z_j| = 0.$$

Since by Theorem 3 every linear subspace $Z$ is closed, it follows that $x_0$ belongs to $Z$.
But then every point $x_0 - z$ in $\{\ \}$ belongs to $Z$, and in fact $\{\ \} = Z$. But we saw earlier
that $\{\ \} = Z$ is the zero element of $X/Z$. Since we have stipulated $\{\ \} \ne 0$, we have a
contradiction, which we got into by assuming $|\{\ \}| = 0$.

Homogeneity is fairly obvious; we turn now to subadditivity: by definition (38) we
can, given any $\epsilon > 0$, choose $x_0$ in $\{x\}$ and $y_0$ in $\{y\}$ so that

$$|x_0| < |\{x\}| + \epsilon, \qquad |y_0| < |\{y\}| + \epsilon. \tag{40}$$

Addition of classes is defined so that $x_0 + y_0$ belongs to $\{x\} + \{y\}$. Therefore by
definition (38), subadditivity of $|\cdot|$, and (40),

$$|\{x\} + \{y\}| \le |x_0 + y_0| \le |x_0| + |y_0| < |\{x\}| + |\{y\}| + 2\epsilon.$$

Since $\epsilon$ is an arbitrary positive number,

$$|\{x\} + \{y\}| \le |\{x\}| + |\{y\}|$$

follows. This completes the proof of Theorem 10. □

We conclude this chapter by remarking that a norm in a linear space over the
complex numbers is defined entirely analogously, by the three properties (1). The
theorems proved in the real case extend to the complex. To prove Theorems 7 and 9
in the complex case, we need a complex version of the Hahn-Banach theorem, due
to Bohnenblust-Sobczyk and Sukhomlinov. Here it is:

Theorem 11. $X$ is a linear space over $\mathbb{C}$, and $p$ is a real-valued function defined
on $X$ with the following properties:

(i) $p$ is absolute homogeneous; that is, it satisfies

$$p(ax) = |a|\,p(x)$$

for all complex numbers $a$ and all $x$ in $X$.

(ii) $p$ is subadditive:

$$p(x + y) \le p(x) + p(y).$$

Let $U$ be a subspace of $X$, and $l$ a linear functional defined on $U$ that satisfies

$$|l(u)| \le p(u) \tag{41}$$

for all $u$ in $U$.

Then $l$ can be extended as a linear functional to the whole space so that

$$|l(x)| \le p(x) \tag{41'}$$

for all $x$ in $X$.

Proof. The complex linear space $X$ can also be regarded as a linear space over $\mathbb{R}$.
Any linear function on complex $X$ can be split into its real and imaginary part:

$$l(u) = l_1(u) + i\,l_2(u),$$

where $l_1$ and $l_2$ are real-valued, and linear on real $U$. $l_1$ and $l_2$ are related by

$$l_1(iu) = -l_2(u).$$

Conversely, if $l_1$ is a real-valued linear function over real $X$,

$$l(x) = l_1(x) - i\,l_1(ix) \tag{42}$$

is linear over complex $X$.

We turn now to the task of extending $l$. It follows from (41) that $l_1$, the real part of
$l$, satisfies on $U$ the inequality

$$l_1(u) \le p(u). \tag{43}$$

Therefore, by the real Hahn-Banach Theorem, $l_1$ can be extended to all of $X$ so
that the extended $l_1$ is linear on real $X$ and satisfies inequality (43). Define $l$ by
formula (42); clearly, it is linear over complex $X$ and is an extension of $l$ defined on
$U$. We claim that it satisfies (41)' for all $x$ in $X$. To see this, we factor $l(x)$ as

$$l(x) = ar, \qquad r \text{ real and nonnegative}, \quad |a| = 1.$$

Using the fact that if $l(y)$ is real, it is equal to $l_1(y)$, we deduce that

$$|l(x)| = r = a^{-1} l(x) = l(a^{-1}x) = l_1(a^{-1}x) \le p(a^{-1}x) = p(x).$$

□

We conclude this chapter by a curious characterization of Euclidean norms
among all norms. According to equation (53) of Chapter 7, every pair of vectors $u$, $v$
in a Euclidean space satisfies the following identity:

$$\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2.$$

Theorem 12. This identity characterizes Euclidean space. That is, if in a real
normed linear space $X$

$$|u + v|^2 + |u - v|^2 = 2|u|^2 + 2|v|^2 \tag{44}$$

for all pairs of vectors $u$, $v$, then the norm $|\ |$ is Euclidean.

Proof. We define a scalar product in $X$ as follows:

$$4(x, y) = |x + y|^2 - |x - y|^2. \tag{45}$$

The following properties of a scalar product follow immediately from definition
(45):

$$(x, x) = |x|^2, \tag{46}$$

Symmetry:

$$(y, x) = (x, y), \tag{47}$$

and

$$(x, -y) = -(x, y). \tag{48}$$

Next we show that $(x, y)$ as defined in (45) is additive:

$$(x + z, y) = (x, y) + (z, y). \tag{49}$$

By definition (45),

$$4(x + z, y) = |x + z + y|^2 - |x + z - y|^2. \tag{50}$$

We apply now identity (44) four times:

(i) $u = x + y$, $v = z$:

$$|x + y + z|^2 + |x + y - z|^2 = 2|x + y|^2 + 2|z|^2. \tag{51$_{i}$}$$

(ii) $u = y + z$, $v = x$:

$$|x + y + z|^2 + |y + z - x|^2 = 2|y + z|^2 + 2|x|^2. \tag{51$_{ii}$}$$

(iii) $u = x - y$, $v = z$:

$$|x - y + z|^2 + |x - y - z|^2 = 2|x - y|^2 + 2|z|^2. \tag{51$_{iii}$}$$

(iv) $u = z - y$, $v = x$:

$$|z - y + x|^2 + |z - y - x|^2 = 2|z - y|^2 + 2|x|^2. \tag{51$_{iv}$}$$

Add (51)$_{i}$ and (51)$_{ii}$ and subtract from it (51)$_{iii}$ and (51)$_{iv}$; we get, after dividing
by 2,

$$|x + y + z|^2 - |x - y + z|^2 = |x + y|^2 - |x - y|^2 + |y + z|^2 - |y - z|^2. \tag{52}$$

The left-hand side of (52) equals $4(x + z, y)$, and the right-hand side is
$4(x, y) + 4(z, y)$. This proves (49). □
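
Identity (44) and the polarization formula (45) are simple to test numerically. The Python sketch below is an illustration with made-up sample vectors and hypothetical helper names: it measures how far a given norm is from satisfying (44), and evaluates (45) for the Euclidean norm, where it should reproduce the usual dot product.

```python
import math

def norm2(x):
    """Euclidean norm."""
    return math.sqrt(sum(a * a for a in x))

def norm1(x):
    """The norm |x|_1, which is not Euclidean."""
    return sum(abs(a) for a in x)

def parallelogram_defect(norm, u, v):
    """|u+v|^2 + |u-v|^2 - 2|u|^2 - 2|v|^2; zero exactly when (44) holds for u, v."""
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(u) ** 2 - 2 * norm(v) ** 2

def polarization(norm, x, y):
    """The candidate scalar product (x, y) defined by (45)."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return (norm(plus) ** 2 - norm(minus) ** 2) / 4.0

u, v = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_defect(norm2, u, v))   # zero up to rounding
print(parallelogram_defect(norm1, u, v))   # nonzero: |x|_1 violates (44)
print(polarization(norm2, [1.0, 2.0], [3.0, 4.0]))   # close to 1*3 + 2*4
```

A single pair $u$, $v$ with nonzero defect is enough to show that a norm is not Euclidean; the theorem says that zero defect for all pairs is also sufficient.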

Exercise 9. (i) Show that for all rational $r$,

$$(rx, y) = r(x, y).$$

(ii) Show that for all real $k$,

$$(kx, y) = k(x, y).$$



CHAPTER 15 


Linear Mappings Between Normed 
Linear Spaces 


Let $X$ and $Y$ be a pair of finite-dimensional normed linear spaces over the reals; we
shall denote the norm in both spaces by $|\ |$, although they have nothing to do with
each other. The first lemma shows that every linear map of one normed linear space
into another is bounded.

Ltvnma 1. For any linear mdp T: Jf — F, there i;i a cort^iant c such thal for all x 
iftX, 

|Tae| < c\x\, (1) 

Prtwf, fixprett x wilh respect ro u hask 

iheti 

Trs = Yi a ^ 

By properties of Ihe norm in K 

\^\ < 52 KII Tj /l- 

Fmra this wc deduce thtai 

|Tx|< k\x\^ {3) 
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where 

kU ^ \ a j\^ k = Jl M- 

VVe have noted in Chapter 14 thul | 1^ is a norm. Since we have shown in Chapter 14, 
Theorem 2, that all norms arc equivalent < const |.tj imd (1) follows from 
(3). □ 


EXERCISE 1. Show that every linear map T: X → Y is continuous, that is, if
$\lim x_n = x$, then $\lim Tx_n = Tx$.

In Chapter 7 we have defined the norm of a mapping of one Euclidean space into
another. Analogously, we have the following definition.


Definition. The norm of the linear map T: X → Y, denoted as |T|, is

$$|T| = \sup_{x \ne 0} \frac{|Tx|}{|x|}. \tag{4}$$

Remark 1. It follows from (1) that |T| is finite.

Remark 2. It is easy to see that |T| is the smallest value we can choose for c in
inequality (1).

Because of the homogeneity of norms, definition (4) can be phrased as follows:

$$|T| = \sup_{|x| = 1} |Tx|. \tag{4'}$$

Theorem 2. |T| as defined in (4) and (4)' is a norm in the linear space of all
linear mappings of X into Y.

Proof. Suppose T is nonzero; that means that for some vector $x_0 \ne 0$, $Tx_0 \ne 0$.
Then by (4),

$$|T| \ge \frac{|Tx_0|}{|x_0|};$$

since the norms in X and Y are positive, the positivity of |T| follows.

To prove subadditivity we note, using (4)', that when S and T are two mappings of
X → Y, then

$$|T + S| = \sup_{|x|=1} |(T + S)x| \le \sup_{|x|=1} \bigl(|Tx| + |Sx|\bigr) \le \sup_{|x|=1} |Tx| + \sup_{|x|=1} |Sx| = |T| + |S|.$$

The crux of the argument is that the supremum of a function that is the sum of two
others is less than or equal to the sum of the separate suprema of the two summands.
Homogeneity is obvious; this completes the proof of Theorem 2. □

Given any mapping T from one linear space X into another Y, we explained in
Chapter 3 that there is another map, called the transpose of T and denoted as $T'$,
mapping $Y'$, the dual of Y, into $X'$, the dual of X. The defining relation between the
two maps is given in equation (9) of Chapter 3:

$$(T'l, x) = (l, Tx), \tag{5}$$

where x is any vector in X and l is any element of $Y'$. The scalar product on the right,
(l, y), denotes the bilinear pairing of elements y of Y and l of $Y'$. The scalar product
(m, x) on the left is the bilinear pairing of elements x in X and m in $X'$. Relation (5)
defines $T'l$ as an element of $X'$. We have noted in Chapter 3 that (5) is a symmetric
relation between T and $T'$ and that

$$T'' = T, \tag{6}$$

just as $X''$ is X and $Y''$ is Y.

We have shown in Chapter 14 that there is a natural way of introducing a dual
norm in the dual $X'$ of a normed linear space X, see Theorem 7; for m in $X'$,

$$|m|' = \sup_{|x|=1} (m, x). \tag{7}$$

The dual norm for l in $Y'$ is defined similarly as $\sup (l, y)$, $|y| = 1$; from this definition
[see equation (24) of Chapter 14], it follows that

$$(l, y) \le |l|'\,|y|. \tag{8}$$

Theorem 3. Let T be a linear mapping from a normed linear space X into
another normed linear space Y, $T'$ its transpose, mapping $Y'$ into $X'$. Then

$$|T'| = |T|, \tag{9}$$

where $X'$ and $Y'$ are equipped with the dual norms.

Proof. Apply definition (7) to $m = T'l$:

$$|T'l|' = \sup_{|x|=1} (T'l, x).$$

Using definition (5) of the transpose, we can rewrite the right-hand side as

$$|T'l|' = \sup_{|x|=1} (l, Tx).$$

Using the estimate (8) on the right, with y = Tx, we get

$$|T'l|' \le \sup_{|x|=1} |l|'\,|Tx|.$$

Using (4)' to estimate |Tx| we deduce that

$$|T'l|' \le |l|'\,|T|.$$

By definition (4) of the norm of $T'$, this implies

$$|T'| \le |T|. \tag{10}$$

We replace now T by $T'$ in (10); we obtain

$$|T''| \le |T'|. \tag{10'}$$

According to (6), $T'' = T$, and according to Theorem 8 of Chapter 14, the norms in
$X''$ and $Y''$, the spaces between which $T''$ acts, are the same as the norms in X and Y.
This shows that $|T''| = |T|$; now we can combine (10) and (10)' to deduce (9). This
completes the proof of Theorem 3. □

Let T be a linear map of a linear space X into Y, S another linear map of Y into
another linear space Z. Then, as remarked in Chapter 3, we can define the product ST
as the composite mapping of T followed by S.

Theorem 4. Suppose X, Y, and Z above are normed linear spaces; then

$$|ST| \le |S|\,|T|. \tag{11}$$

Proof. By definition (4),

$$|Sy| \le |S|\,|y|, \qquad |Tx| \le |T|\,|x|. \tag{12}$$

Hence

$$|STx| \le |S|\,|Tx| \le |S|\,|T|\,|x|. \tag{13}$$

Applying definition (4) to ST completes the proof of inequality (11). □

We recall that a mapping T of one linear space X into another linear space Y is called
invertible if it maps X onto Y, and is one-to-one. In this case T has an inverse, denoted
as $T^{-1}$.

In Chapter 7, Theorem 15, we have shown that if a mapping B of a Euclidean
space into itself doesn't differ too much from another mapping A that is invertible,
then B, too, is invertible. We present now a straightforward extension of this result to
normed linear spaces.


Theorem 5. Let X and Y be finite-dimensional normed linear spaces of the same
dimension, and let T be a linear mapping of X into Y that is invertible. Let S be
another linear map of X into Y that does not differ too much from T in the sense that

$$|S - T| < k, \qquad k = \frac{1}{|T^{-1}|}. \tag{14}$$

Then S is invertible.

Proof. We have to show that S is one-to-one and onto. We show first that S is
one-to-one. We argue indirectly; suppose that for some $x_0 \ne 0$,

$$Sx_0 = 0. \tag{15}$$

Then

$$Tx_0 = (T - S)x_0.$$

Since T is invertible,

$$x_0 = T^{-1}(T - S)x_0.$$

Using Theorem 4 and (14) and that $|x_0| > 0$, we get

$$|x_0| \le |T^{-1}|\,|T - S|\,|x_0| < |T^{-1}|\,k\,|x_0| = |x_0|,$$

a contradiction; this shows that (15) is untenable and so S is one-to-one.

According to Corollary B of Theorem 1 in Chapter 3, a mapping S of a
linear space X into another linear space of the same dimension that is one-to-one
is onto. Since we have shown that S is one-to-one, this completes the proof of
Theorem 5. □


Theorem 5 holds for normed linear spaces that are not finite dimensional,
provided that they are complete. Corollary B of Theorem 1 of Chapter 3 does not
hold in spaces of infinite dimension; therefore we need a different, more direct
argument to invert S. We now present such an argument. We start by recalling the
notion of convergence in a normed linear space applied to the space of linear maps.

Definition. Let X, Y be a pair of finite-dimensional normed linear spaces. A
sequence $\{T_n\}$ of linear maps of X into Y is said to converge to the linear map T,
denoted as $\lim_{n \to \infty} T_n = T$, if

$$\lim_{n \to \infty} |T_n - T| = 0. \tag{16}$$


Theorem 6. Let X be a normed finite-dimensional linear space, R a linear map
of X into itself whose norm is less than 1:

$$|R| < 1. \tag{17}$$

Then

$$S = I - R \tag{18}$$

is invertible, and

$$S^{-1} = \sum_{k=0}^{\infty} R^k. \tag{18'}$$

Proof. Denote $\sum_{k=0}^{n} R^k$ as $T_n$, and denote $T_n x$ as $y_n$. We claim that $\{y_n\}$ is a Cauchy
sequence; that is, $|y_n - y_l|$ tends to zero as n and l tend to ∞. To see this, we write,
for n > l,

$$y_n - y_l = T_n x - T_l x = \sum_{k=l+1}^{n} R^k x.$$

By the triangle inequality

$$|y_n - y_l| \le \sum_{k=l+1}^{n} |R^k x|. \tag{19}$$

Using repeatedly the multiplicative property of the norm of operators, we conclude
that

$$|R^k| \le |R|^k.$$

It follows that

$$|R^k x| \le |R^k|\,|x| \le |R|^k |x|.$$

Set this estimate into (19); we get

$$|y_n - y_l| \le \sum_{k=l+1}^{n} |R|^k |x|. \tag{20}$$

Since |R| is assumed to be less than one, the right-hand side of (20) tends to zero as
n and l tend to ∞. This shows that $y_n = T_n x = \sum_{k=0}^{n} R^k x$ is a Cauchy sequence.

According to Theorem 5 of Chapter 14, every Cauchy sequence in a finite-
dimensional normed linear space has a limit. We define the mapping T as

$$Tx = \lim_{n \to \infty} T_n x. \tag{21}$$

We claim that T is the inverse of I - R. According to Exercise 1, the mapping I - R
is continuous; therefore it follows from (21) that

$$(I - R)Tx = \lim_{n \to \infty} (I - R)T_n x.$$

Since $T_n = \sum_{k=0}^{n} R^k$,

$$(I - R)T_n x = (I - R)\sum_{k=0}^{n} R^k x = x - R^{n+1} x.$$

As n → ∞, the left-hand side tends to (I - R)Tx and the right-hand side tends to x;
this proves that T is the inverse of I - R. □
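
The construction in the proof of Theorem 6 can be tried out numerically. The following Python sketch (the function names and the 2 × 2 matrix R are assumptions chosen for illustration) forms the partial sum $T_n = I + R + \cdots + R^n$ and checks that $(I - R)T_n$ is close to the identity, in line with the relation $(I - R)T_n x = x - R^{n+1}x$ used above. The matrix is chosen so that its maximum row sum is 0.5, which, measured in the maximum norm, gives |R| < 1.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(R, n):
    """T_n = I + R + R^2 + ... + R^n, the partial sum appearing in (18)'."""
    size = len(R)
    total = identity(size)
    power = identity(size)
    for _ in range(n):
        power = matmul(power, R)
        total = [[total[i][j] + power[i][j] for j in range(size)]
                 for i in range(size)]
    return total

R = [[0.2, 0.3], [0.1, 0.4]]      # maximum row sum 0.5
T = neumann_partial_sum(R, 60)
I_minus_R = [[(1.0 if i == j else 0.0) - R[i][j] for j in range(2)]
             for i in range(2)]
product = matmul(I_minus_R, T)    # equals I - R^61, hence close to the identity
print(product)
```

Sixty terms are far more than needed here; the remainder $R^{n+1}$ shrinks at least like $0.5^{n+1}$.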

EXERCISE 2. Show that if for every x in X, $|T_n x - Tx|$ tends to zero as n → ∞,
then $|T_n - T|$ tends to zero.

EXERCISE 3. Show that $T_n = \sum_{k=0}^{n} R^k$ converges to $S^{-1}$ in the sense of definition (16).

Theorem 6 is a special case of Theorem 5, with Y = X and T = I.

EXERCISE 4. Deduce Theorem 5 from Theorem 6 by factoring S = T + S - T
as $T[I - T^{-1}(T - S)]$.

EXERCISE 5. Show that Theorem 6 remains true if the hypothesis (17) is
replaced by the following hypothesis. For some positive integer m,

$$|R^m| < 1. \tag{22}$$

EXERCISE 6. Take $X = Y = \mathbb{R}^n$, and T: X → X the matrix $(t_{ij})$. Take for the
norm |x| the maximum norm $|x|_\infty$ defined by formula (3) of Chapter 14. Show that
the norm |T| of the matrix $(t_{ij})$, regarded as a mapping of X into X, is

$$|T| = \max_i \sum_j |t_{ij}|. \tag{23}$$

EXERCISE 7. Take X to be $\mathbb{R}^n$ normed by the maximum norm $|x|_\infty$, Y to be $\mathbb{R}^n$
normed by the 1-norm $|x|_1$, defined by formulas (3) and (4) in Chapter 14. Show that
the norm of the matrix $(t_{ij})$ regarded as a mapping of X into Y is bounded by

$$|T| \le \sum_{i,j} |t_{ij}|.$$

EXERCISE 8. X is any finite-dimensional normed linear space over $\mathbb{C}$, and T is a
linear mapping of X into X. Denote by $t_j$ the eigenvalues of T, and denote by r(T) its
spectral radius:

$$r(T) = \max_j |t_j|.$$

(i) Show that $|T| \ge r(T)$.

(ii) Show that $|T^n| \ge r(T)^n$.

(iii) Show, using Theorem 18 of Chapter 7, that

$$\lim_{n \to \infty} |T^n|^{1/n} = r(T).$$
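
The statements of Exercises 6 and 8 can be explored numerically before proving them. The sketch below is an illustration under stated assumptions, not a solution: the function names are made up for this example, and the triangular matrix T is chosen so that both eigenvalues equal 0.5, giving r(T) = 0.5. It evaluates the maximum row sum of formula (23), checks that the supremum in (4)' is attained at a vector of maximum norm 1, and prints $|T^n|^{1/n}$ for growing n. The norm |T| itself is much larger than r(T), yet the nth roots drift down toward r(T).

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_row_sum(T):
    """Formula (23): the operator norm induced by the maximum norm."""
    return max(sum(abs(t) for t in row) for row in T)

def apply(T, x):
    """The vector Tx."""
    return [sum(t * c for t, c in zip(row, x)) for row in T]

def root_of_power_norm(T, n):
    """|T^n|^(1/n), with |.| the norm (23)."""
    P = T
    for _ in range(n - 1):
        P = matmul(P, T)
    return max_row_sum(P) ** (1.0 / n)

T = [[0.5, 10.0], [0.0, 0.5]]   # triangular: both eigenvalues are 0.5
x = [1.0, 1.0]                  # a vector of maximum norm 1
print(max_row_sum(T), max(abs(c) for c in apply(T, x)))  # the two numbers agree
for n in (1, 10, 200):
    print(n, root_of_power_norm(T, n))   # decreases toward r(T) = 0.5
```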



CHAPTER 16 


Positive Matrices 


Definition. A real l × l matrix P is called entrywise positive if all its entries $p_{ij}$ are
positive real numbers.

This notion of positivity, used only in this chapter, is not to be confused
with self-adjoint matrices that are positive in the sense of Chapter 10.

Theorem 1 (Perron). Every positive matrix P has a dominant eigenvalue,
denoted by λ(P), which has the following properties:

(i) λ(P) is positive and the associated eigenvector h has positive entries:

$$Ph = \lambda(P)h, \qquad h > 0. \tag{1}$$

(ii) λ(P) is a simple eigenvalue.

(iii) Every other eigenvalue κ of P is less than λ(P) in absolute value:

$$|\kappa| < \lambda(P). \tag{2}$$

(iv) P has no other eigenvector f with nonnegative entries.

Proof. We recall from Chapter 13 that inequality between vectors in $\mathbb{R}^l$ means
that the inequality holds for all corresponding components. We denote by p(P) the
set of all nonnegative numbers λ for which there is a nonnegative vector x ≠ 0 such
that

$$Px \ge \lambda x, \qquad x \ge 0. \tag{3}$$

Lemma 2. For P positive,

(i) p(P) is nonempty, and contains a positive number,

(ii) p(P) is bounded,

(iii) p(P) is closed.

Proof. Take any positive vector x; since P is positive, Px is a positive vector.
Clearly, (3) will hold for λ small enough positive; this proves (i) of the lemma.
Since both sides of (3) are linear in x, we can normalize x so that

$$\xi x = \sum_i x_i = 1, \qquad \xi = (1, \ldots, 1). \tag{4}$$

Multiply (3) by ξ on the left:

$$\xi P x \ge \lambda\, \xi x = \lambda. \tag{5}$$

Denote the largest component of ξP by b; then $b\xi \ge \xi P$. Setting this into (5) gives
b ≥ λ; this proves part (ii) of the lemma.

To prove (iii), consider a sequence of $\lambda_n$ in p(P); by definition there is a
corresponding $x_n \ne 0$ such that (3) holds:

$$Px_n \ge \lambda_n x_n, \qquad x_n \ge 0. \tag{6}$$

We might as well assume that the $x_n$ are normalized by (4):

$$\xi x_n = 1.$$

The set of nonnegative $x_n$ normalized by (4) is a closed bounded set in $\mathbb{R}^l$ and
therefore compact. Thus a subsequence of $x_n$ tends to a nonnegative x also
normalized by (4), while $\lambda_n$ tends to λ. Passing to the limit of (6) shows that x, λ
satisfy (3); therefore p(P) is closed. This proves part (iii) of the lemma. □

Having shown that p(P) is closed and bounded, it follows that it has a maximum
$\lambda_{\max}$; by (i), $\lambda_{\max} > 0$. We shall show now that $\lambda_{\max}$ is the dominant eigenvalue.

The first thing to show is that $\lambda_{\max}$ is an eigenvalue. Since (3) is satisfied by $\lambda_{\max}$,
there is a nonnegative vector h for which

$$Ph \ge \lambda_{\max} h, \qquad h \ge 0,\ h \ne 0; \tag{7}$$

we claim that equality holds in (7); for, suppose not, say in the kth component:

$$\sum_j p_{kj} h_j > \lambda_{\max} h_k. \tag{7'}$$

Define the vector $x = h + \epsilon e_k$, where ε > 0 and $e_k$ has kth component equal to 1, all
other components zero. Since P is positive, replacing h by x in (7) increases each
component of the left-hand side: Px > Ph. But only the kth component of the right-
hand side is increased when h is replaced by x. It follows therefore from (7)' that for
ε small enough positive,

$$Px > \lambda_{\max} x. \tag{8}$$

Since this is a strict inequality, we may replace $\lambda_{\max}$ by $\lambda_{\max} + \delta$, δ positive but so
small that (8) still holds. This shows that $\lambda_{\max} + \delta$ belongs to p(P), contrary to the
maximal character of $\lambda_{\max}$. This proves that $\lambda_{\max}$ is an eigenvalue of P and that there
is a corresponding eigenvector h that is nonnegative.

We claim now that the vector h is positive. For certainly, since P is positive and
h ≥ 0, h ≠ 0, it follows that Ph > 0. Since $Ph = \lambda_{\max} h$, h > 0 follows. This proves part (i)
of Theorem 1.


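Next we show that $\lambda_{\max}$ is simple. We observe that all eigenvectors of P with
eigenvalue $\lambda_{\max}$ must be proportional to h; for if there were another eigenvector y not
a multiple of h, then we could construct h + cy, c so chosen that h + cy ≥ 0 but one of
the components of h + cy is zero. This contradicts our argument above that an
eigenvector of P that is nonnegative is in fact positive.

To complete the proof of (ii) we have to show that P has no generalized
eigenvector for the eigenvalue $\lambda_{\max}$, that is, a vector y such that

$$Py = \lambda_{\max} y + ch. \tag{9}$$

By replacing y by -y if necessary we can make sure that c > 0; by replacing y by
y + bh if necessary we can make sure that y is positive; it follows then from (9) and
h > 0 that $Py > \lambda_{\max} y$. But then for δ small enough, greater than 0,

$$Py > (\lambda_{\max} + \delta)y,$$

contrary to $\lambda_{\max}$ being the largest number in p(P).

To show part (iii) of Theorem 1, let κ be another eigenvalue of P, not equal
to $\lambda_{\max}$, y the corresponding eigenvector, both possibly complex: Py = κy;
componentwise,

$$\sum_j p_{ij} y_j = \kappa y_i.$$

Using the triangle inequality for complex numbers and their absolute values,
we get

$$\sum_j p_{ij} |y_j| \ge \Bigl|\sum_j p_{ij} y_j\Bigr| = |\kappa|\,|y_i|. \tag{10}$$

Comparing this with (3), we see that |κ| belongs to p(P). If |κ| were equal to $\lambda_{\max}$, the
vector

$$(|y_1|, \ldots, |y_l|)$$

would be an eigenvector of P with eigenvalue $\lambda_{\max}$, and thus proportional to h:

$$|y_i| = c h_i. \tag{11}$$

Furthermore, the sign of equality would hold in (10). It is well known about complex
numbers that this is the case only if all the $y_i$ have the same complex argument:

$$y_i = e^{i\theta} |y_i|, \qquad i = 1, \ldots, l.$$

Combining this with (11) we see that

$$y_i = c e^{i\theta} h_i, \quad \text{that is,} \quad y = (c e^{i\theta}) h.$$

Thus $\kappa = \lambda_{\max}$, and the proof of part (iii) is complete.

To prove (iv) we recall from Chapter 6, Theorem 17, that the product of
eigenvectors of P and its transpose $P^T$ pertaining to different eigenvalues is zero.
Since $P^T$ also is positive, the eigenvector ξ pertaining to its dominant eigenvalue,
which is the same as that of P, has positive entries. Since a positive vector ξ does not
annihilate a nonnegative vector f, part (iv) follows from ξf = 0. This completes the
proof of Theorem 1. □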

The above proof is due to Bohnenblust; see R. Bellman, Introduction to Matrix
Analysis.

EXERCISE 1. Denote by t(P) the set of nonnegative λ such that

$$Px \le \lambda x, \qquad x \ge 0$$

for some vector x ≠ 0. Show that the dominant eigenvalue λ(P) satisfies

$$\lambda(P) = \min_{\lambda \in t(P)} \lambda. \tag{12}$$

We give now applications of Perron's theorem.

Definition. A stochastic matrix is an l × l matrix S whose entries are
nonnegative:

$$s_{ij} \ge 0, \tag{13}$$

and whose column sums are equal to 1:

$$\sum_i s_{ij} = 1, \qquad j = 1, \ldots, l. \tag{14}$$

The interpretation lies in the study of collections of l species, each of which has
the possibility of changing into another. The numbers $s_{ij}$ are called transition
probabilities; they represent the fraction of the population of the jth species that is
replaced by the ith species. Condition (13) is natural for this interpretation; condition
(14) specifies that the total population is preserved. There are interesting
applications where this is not so.

The kind of species that can undergo change describable as in the foregoing are
atomic nuclei, mutants sharing a common ecological environment, and many others.

We shall first study positive stochastic matrices, that is, ones for which (13) is a
strict inequality. To these Perron's theorem is applicable and yields the following
theorem.


Theorem 3. Let S be a positive stochastic matrix.

(i) The dominant eigenvalue λ(S) = 1.

(ii) Let x be any nonnegative vector, not equal to 0; then

$$\lim_{N \to \infty} S^N x = ch, \tag{15}$$

where h is the dominant eigenvector and c is some positive constant.

Proof. As remarked earlier, if S is a positive matrix, so is its transpose $S^T$. Since,
according to Theorem 16, Chapter 6, S and $S^T$ have the same eigenvalues, it follows
that S and $S^T$ have the same dominant eigenvalue. Now the dominant eigenvalue of
the transpose of a stochastic matrix is easily computed: It follows from (14) that the
vector with all entries 1,

$$\xi = (1, \ldots, 1),$$

is a left eigenvector of S, with eigenvalue 1. It follows from part (iv) of Theorem 1 that
this is the dominant eigenvector and 1 is the dominant eigenvalue. This proves part (i).

To prove (ii), we expand x as a sum of eigenvectors $h_j$ of S:

$$x = \sum_j c_j h_j. \tag{16}$$

Assuming that all eigenvectors of S are genuine, not generalized, we get

$$S^N x = \sum_j c_j \lambda_j^N h_j. \tag*{$(16)_N$}$$

Here the first component is taken to be the dominant one; so $\lambda_1 = \lambda = 1$, $|\lambda_j| < 1$ for
j ≠ 1. From this and $(16)_N$ we conclude that

$$S^N x \to ch, \tag{17}$$

where $c = c_1$, $h = h_1$, the dominant eigenvector.

To prove that c is positive, form the scalar product of (17) with ξ. Since
$\xi = S^T \xi = (S^T)^N \xi$, we get

$$(S^N x, \xi) = (x, (S^T)^N \xi) = (x, \xi) \to c(h, \xi). \tag{17'}$$

We have assumed that x is nonnegative and not equal to 0; ξ and h are positive.
Therefore it follows from (17)' that c is positive. This proves part (ii) of Theorem 3
when all eigenvectors are genuine. The general case can be handled
similarly. □

We turn now to applications of Theorem 3 to systems whose change is governed
by transition probabilities. Denote by $x_1, \ldots, x_n$ the population size of the jth
species, j = 1, ..., n; suppose that during a unit of time (a year, a day, a nanosecond)
each individual of the collection changes (or gives birth to) a member of the other
species according to the probabilities $s_{ij}$. If the population size is so large that
fluctuations are unimportant, the new size of the population of the ith species will be

$$y_i = \sum_j s_{ij} x_j. \tag{18}$$

Combining the components of the old and new population into single column vectors
x and y, relation (18) can be expressed in the language of matrices as

$$y = Sx. \tag{18'}$$

After N units of time, the population vector will be $S^N x$. The significance of Theorem
3 in such applications is that it shows that as N → ∞, such populations tend to a
steady distribution that does not depend on where the population started from.
Theorem 3 is the basis of Google's search strategy.
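
To see relation (18)' and Theorem 3 at work, here is a small Python sketch. The 3 × 3 matrix S and the function names are assumptions chosen for illustration; S is entrywise positive and each of its columns sums to 1. Starting from two different populations of total size 1, repeated application of S leads to the same steady distribution, as (15) predicts.

```python
def step(S, x):
    """One unit of time: y = S x, relation (18)'."""
    return [sum(s * c for s, c in zip(row, x)) for row in S]

def evolve(S, x, N):
    """The population vector S^N x after N units of time."""
    for _ in range(N):
        x = step(S, x)
    return x

# A positive matrix whose column sums equal 1, as required by (13) and (14).
S = [[0.5, 0.2, 0.3],
     [0.3, 0.6, 0.1],
     [0.2, 0.2, 0.6]]

a = evolve(S, [1.0, 0.0, 0.0], 100)
b = evolve(S, [0.0, 0.0, 1.0], 100)
print(a)
print(b)   # agrees with a up to rounding: the limit does not depend on the start
```

Both starting vectors have total population 1, and condition (14) preserves that total, so the two limits share the same constant c in (15).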

Theorem 1, and therefore Theorem 3, depend on the positivity of the matrix P;
in many applications we have to deal with matrices that are merely nonnegative.
How much of Theorem 1 remains true for such matrices?

The three examples



show different behavior. The first one has a dominant eigenvalue; the second has plus
or minus 1 as eigenvalues, neither dominated by the other; the third has 1 as a double
eigenvalue.

EXERCISE 2. Show that if some power $P^m$ of P is positive, then P has a dominant
positive eigenvalue.

There are other interesting and useful criteria for nonnegative matrices to have a
dominant positive eigenvalue. These are combinatorial in nature; we shall not speak
about them. There is also the following result, due to Frobenius.


Theorem 4. Every nonnegative l × l matrix F, F ≠ 0, has an eigenvalue λ(F)
with the following properties:

(i) λ(F) is nonnegative, and the associated eigenvector has nonnegative entries:

$$Fh = \lambda(F)h, \qquad h \ge 0. \tag{19}$$

(ii) Every other eigenvalue κ is less than or equal to λ(F) in absolute value:

$$|\kappa| \le \lambda(F). \tag{20}$$

(iii) If |κ| = λ(F), then κ is of the form

$$\kappa = e^{2\pi i k/m} \lambda(F), \tag{21}$$

where k and m are positive integers, m ≤ l.

Remark. Theorem 4 can be used to study the asymptotically periodic behavior
for large N of $S^N x$, where S is a nonnegative stochastic matrix. This has applications
to the study of cycles in population growth.
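
A simple instance of the periodic behavior mentioned in the Remark is a cyclic permutation matrix. The following Python sketch is an illustrative assumption, not an example taken from the text: the 3 × 3 matrix S is nonnegative with column sums 1 but is not positive, the states $S^N x$ cycle with period 3 instead of converging, and $e^{2\pi i/3}$ is an eigenvalue of modulus λ(S) = 1, of the form (21) with k = 1 and m = 3.

```python
import cmath

def step(S, x):
    """The vector S x."""
    return [sum(s * c for s, c in zip(row, x)) for row in S]

# Cyclic permutation matrix: nonnegative, column sums 1, but not positive.
S = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]

x = [0.6, 0.3, 0.1]
orbit = [x]
for _ in range(6):
    orbit.append(step(S, orbit[-1]))
for N, state in enumerate(orbit):
    print(N, state)       # the states repeat with period 3 and never settle down

# kappa = e^{2 pi i / 3} is an eigenvalue with eigenvector (1, kappa^2, kappa).
kappa = cmath.exp(2j * cmath.pi / 3)
v = [1, kappa ** 2, kappa]
Sv = step(S, v)
print(max(abs(p - kappa * q) for p, q in zip(Sv, v)))   # essentially zero
```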

Proof. Approximate F by a sequence $F_n$ of positive matrices. Since the
characteristic equations of $F_n$ tend to the characteristic equation of F, it follows
that the eigenvalues of $F_n$ tend to the eigenvalues of F. Now define

$$\lambda(F) = \lim_{n \to \infty} \lambda(F_n).$$

Clearly, as n → ∞, inequality (20) follows from inequality (2) for $F_n$. To prove (i),
we use the dominant eigenvector $h_n$ of $F_n$, normalized as in (4):

$$\xi h_n = 1, \qquad \xi = (1, \ldots, 1).$$

By compactness, a subsequence of $h_n$ converges to a limit vector h. Being the limit
of normalized positive vectors, h is nonnegative. Each $h_n$ satisfies an equation

$$F_n h_n = \lambda(F_n) h_n;$$

letting n tend to ∞ we obtain relation (19) in the limit.

Part (iii) is trivial when λ(F) = 0; so we may assume λ(F) > 0; at the cost of
multiplying F by a constant we may assume that λ(F) = 1. Let κ be a complex
eigenvalue of F, |κ| = λ(F) = 1; then κ can be written as

$$\kappa = e^{i\theta}. \tag{22}$$

Denote by y + iz the corresponding eigenvector:

$$F(y + iz) = e^{i\theta}(y + iz). \tag{23}$$

Separate the real and imaginary parts:

$$Fy = \cos\theta\, y - \sin\theta\, z,$$
$$Fz = \sin\theta\, y + \cos\theta\, z. \tag{23'}$$

The geometric interpretation of (23)' is that in the plane spanned by the vectors y and
z, F is rotation around the origin by θ.

Consider now the plane formed by all points x of the form

$$x = h + ay + bz, \tag{24}$$

a and b arbitrary real numbers, h the eigenvector (19). It follows from (19) and (23)'
that in this plane F acts as rotation by θ. Consider now the set Q formed by all
nonnegative vectors x of form (24); if Q contains an open subset of the plane (24), it
is a polygon. Since F is a nonnegative matrix, it maps Q into itself; since it is a
rotation, it maps Q onto itself. Since Q has m vertices, m ≤ l, the mth power of F is the
identity on this plane; this shows that F rotates Q by an angle θ = 2πk/m.

It is essential for this argument that Q be a polygon, that is, that it contain an open
set of the plane (24). This will be the case when all components of h are positive or
when some components of h are zero, but so are the corresponding components of y
and z. For then all points x of form (24) with |a|, |b| small enough belong to Q; in this
case Q is a polygon.

To complete the proof of Theorem 4(iii), we turn to the case when some
components of h are zero but the corresponding components of y or z are not.
Arrange the components in such an order that the first j components of h are zero, the
rest positive. Then it follows from Fh = h that F has the following block form:

$$F = \begin{pmatrix} F_0 & 0 \\ A & B \end{pmatrix}. \tag{25}$$

Denote by $y_0$ and $z_0$ the vectors formed by the first j components of y and z. By
assumption, $y_0 + iz_0 \ne 0$. Since by (23), y + iz is an eigenvector of F with
eigenvalue $e^{i\theta}$, it follows from (25) that $y_0 + iz_0$ is an eigenvector of $F_0$:

$$F_0(y_0 + iz_0) = e^{i\theta}(y_0 + iz_0).$$

Since $F_0$ is a nonnegative j × j matrix, it follows from part (ii) of Theorem 4 already
established that the dominant eigenvalue $\lambda(F_0)$ cannot be less than $|e^{i\theta}| = 1$. We
claim that equality holds: $\lambda(F_0) = 1$. For, suppose not; then the corresponding
eigenvector $h_0$ would satisfy

$$F_0 h_0 = (1 + \delta)h_0, \qquad h_0 \ge 0,\ \delta > 0. \tag{26}$$

Denote by k the l-vector whose first j components are those of $h_0$, the rest are zero. It
follows from (26) that

$$Fk \ge (1 + \delta)k. \tag{26'}$$

It is easy to show that the dominant eigenvalue of a nonnegative matrix can be
characterized as the largest λ for which (3) can be satisfied. Inequality (26)' would
imply that λ(F) ≥ 1 + δ, contrary to the normalization λ(F) = 1. This proves that
$\lambda(F_0) = 1$.

We do now an induction with respect to l on part (iii) of Theorem 4. Since $e^{i\theta}$ is an
eigenvalue of the j × j matrix $F_0$, and $\lambda(F_0) = 1$, and since j < l, it follows by the
induction hypothesis that θ is a rational multiple of 2π with denominator less than or
equal to j. This completes the proof of Theorem 4. □





CHAPTER 17 


How to Solve Systems 
of Linear Equations 


To get numerical answers out of any linear model, one must in the end obtain the
solution of a system of linear equations. To carry out this task efficiently has
therefore a high priority; it is not surprising that it has engaged the attention of some
of the leading mathematicians. Two methods still in current use, Gaussian
elimination and the Gauss-Seidel iteration, were devised by the Prince of
Mathematicians. The great Jacobi invented an iterative method that bears his name.

The availability of programmable, high-performance computers with large
memories (and remember, yesterday's high-performance computer is today's
pocket computer) has opened the floodgates; the size and scope of linear equations
that could be solved efficiently has been enlarged enormously and the role of linear
models correspondingly enhanced. The success of this effort has been due not only
to the huge increase in computational speed and in the size of rapid access memory,
but in equal measure to new, sophisticated, mathematical methods for solving linear
equations. At the time von Neumann was engaged in inventing and building a
programmable electronic computer, he devoted much time to analyzing the
accumulation and amplification of round-off errors in Gaussian elimination. Other
notable early efforts were the very stable methods that Givens and Householder
found for reducing matrices to Jacobi form (see Chapter 18).

It is instructive to recall that in the 1940s linear algebra was dead as a subject for
research; it was ready to be entombed in textbooks. Yet only a few years later, in
response to the opportunities created by the availability of high-speed computers,
very fast algorithms were found for the standard matrix operations that astounded
those who thought there were no surprises left in this subject.

In this chapter we describe a few representative modern algorithms for solving
linear equations. Included among them, in Section 4, is the conjugate gradient
method developed by Lanczos, Stiefel, and Hestenes.


Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright © 2007 John Wiley & Sons, Inc.

The systems of linear equations considered in this chapter are of the class that
have exactly one solution. Such a system can be written in the form

$$Ax = b, \tag{1}$$

A an invertible square matrix, b some given vector, x the vector of unknowns to be
determined.

An algorithm for solving the system (1) takes as its input the matrix A and the
vector b and produces as output some approximation to the solution x. In designing
and analyzing an algorithm we must first understand how fast and how accurately an
algorithm works when all the arithmetic operations are carried out exactly. Second,
we must understand the effect of rounding, inevitable in computers that do their
arithmetic with a finite number of digits.

With algorithms employing billions of operations, there is a very real danger that
round-off errors not only accumulate but are magnified in the course of the
calculation. Algorithms for which this does not happen are called arithmetically
stable.

It is important to point out that the use of finite digit arithmetic places an absolute
limitation on the accuracy with which the solution can be determined. To understand
this, imagine a change δb being made in the vector b appearing on the right in (1).
Denote by δx the corresponding change in x:

$$A(x + \delta x) = b + \delta b. \tag{2}$$

Since according to (1), Ax = b, we deduce that

$$A\,\delta x = \delta b. \tag{3}$$

We shall compare the relative change in x with the relative change in b, that is, the
ratio

$$\frac{|\delta x|}{|x|} \Big/ \frac{|\delta b|}{|b|}, \tag{4}$$

where the norm is convenient for the problem. The choice of relative change is
natural when the components of vectors are floating point numbers.

We rewrite (4) as

$$\frac{|b|}{|x|}\,\frac{|\delta x|}{|\delta b|} = \frac{|Ax|}{|x|}\,\frac{|A^{-1}\delta b|}{|\delta b|}. \tag{4'}$$

The sensitivity of problem (1) to changes in b is estimated by the maximum of (4)' over
all possible x and δb. The maximum of the first factor on the right in (4)' is |A|, the
norm of A; the maximum of the second factor is $|A^{-1}|$, the norm of $A^{-1}$. Thus we
conclude that the ratio (4) of the relative error in the solution x to the relative error in
b cannot be larger than

$$\kappa(A) = |A|\,|A^{-1}|. \tag{5}$$

The quantity κ(A) is called the condition number of the matrix A.

EXERCISE 1. Show that κ(A) is ≥ 1.

Since in k-digit floating point arithmetic the relative error in b can be as large as
$10^{-k}$, it follows that if equation (1) is solved using k-digit floating point arithmetic,
the relative error in x can be as large as $10^{-k}\kappa(A)$.

It is not surprising that the larger the condition number κ(A), the harder it is to
solve equation (1), for κ(A) = ∞ when the matrix A is not invertible. As we shall
show later in this chapter, the rate of convergence of iterative methods to the exact
solution of (1) is slow when κ(A) is large.

Denote by β the largest absolute value of the eigenvalues of A. Clearly,

$$\beta \le |A|. \tag{6}$$

Denote by α the smallest absolute value of the eigenvalues of A. Then applying
inequality (6) to the matrix $A^{-1}$ we get

$$\frac{1}{\alpha} \le |A^{-1}|. \tag{6'}$$

Combining (6) and (6)' with (5) we obtain this lower bound for the condition number
of A:

$$\frac{\beta}{\alpha} \le \kappa(A). \tag{7}$$
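
The role of the condition number can be made concrete with a small experiment. The Python sketch below is an illustration under stated assumptions: the nearly singular 2 × 2 matrix, the perturbation δb, the choice of the maximum norm, and all function names are made up for this example. It computes the ratio (4) for one particular x and δb and compares it with κ(A) from (5), using the maximum row sum as the matrix norm that goes with the maximum norm of vectors.

```python
def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(A, x):
    """The vector Ax."""
    return [sum(t * c for t, c in zip(row, x)) for row in A]

def vec_norm(x):
    """Maximum norm of a vector."""
    return max(abs(c) for c in x)

def mat_norm(A):
    """Matrix norm induced by the maximum norm: the largest row sum."""
    return max(sum(abs(t) for t in row) for row in A)

def condition_number(A):
    """kappa(A) = |A| |A^{-1}|, definition (5)."""
    return mat_norm(A) * mat_norm(inverse_2x2(A))

A = [[1.0, 1.0], [1.0, 1.0001]]   # nearly singular, hence badly conditioned
Ainv = inverse_2x2(A)
b = [2.0, 2.0001]                 # the exact solution of Ax = b is x = (1, 1)
db = [0.0, 0.0001]                # a small change in the right-hand side
x = apply(Ainv, b)
dx = apply(Ainv, db)              # relation (3): A dx = db
ratio = (vec_norm(dx) / vec_norm(x)) / (vec_norm(db) / vec_norm(b))
kappa = condition_number(A)
print(ratio, kappa)               # the ratio (4) is large but does not exceed kappa
```

A relative change of roughly 5 parts in 100,000 in b changes x by about 100 percent here, which is the kind of magnification that a large κ(A) permits.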

An algorithm that, when all arithmetic operations are carried out exactly,
furnishes in a finite number of steps the exact solution of (1) is called a direct
method. Gaussian elimination discussed in Chapter 4 is such a method. An algorithm
that generates a sequence of approximations that tend, if all arithmetic operations
were carried out exactly, to the exact solution is called an iterative method. In this
chapter we shall investigate the convergence and rate of convergence of several
iterative methods.

Let us denote by $\{x_n\}$ the sequence of approximations generated by an
algorithm. The deviation of $x_n$ from x is called the error at the nth stage, and is
denoted by $e_n$:

$$e_n = x_n - x. \tag{8}$$

The amount by which the nth approximation fails to satisfy equation (1) is called the
nth residual, and is denoted by $r_n$:

$$r_n = Ax_n - b. \tag{9}$$

Residual and error are related to each other by

$$r_n = Ae_n. \tag{10}$$

Note that, since we do not know x, we cannot calculate the errors $e_n$; but once we
have calculated $x_n$ we can by formula (9) calculate $r_n$.

In what follows, we shall restrict our analysis to the case when the matrix A is
real, self-adjoint, and positive; see Chapter 8 and Chapter 10 for the definition of
these concepts. We shall use the Euclidean norm, denoted as || ||, to measure the size
of vectors.

We denote by α and β the smallest and largest eigenvalues of A. Positive
definiteness of A implies that α is positive; see Theorem 1 of Chapter 10. We recall
from Chapter 8, Theorem 12, that the norm of a positive matrix with respect to the
Euclidean norm is its largest eigenvalue:

$$\|A\| = \beta. \tag{11}$$

Since $A^{-1}$ also is positive, we conclude that

$$\|A^{-1}\| = \alpha^{-1}. \tag{11'}$$

Recalling the definition (5) of the condition number κ of A we conclude that for A self-
adjoint and positive,

$$\kappa(A) = \frac{\beta}{\alpha}. \tag{12}$$
1. THE METHOD OF STEEPEST DESCENT

The first iterative method we investigate is based on the variational characterization of the solution of equation (1) in the case when $A$ is positive definite.

Theorem 1. The solution $x$ of (1) minimizes the functional

$$E(y) = \tfrac12 (y, Ay) - (y, b); \tag{13}$$

here $(\,,)$ denotes the Euclidean scalar product of vectors.

Proof. We add to $E(y)$ a constant, that is, a term independent of $y$:

$$F(y) = E(y) + \tfrac12 (x, b). \tag{14}$$

Set (13) into (14); using $Ax = b$ and the self-adjointness of $A$ we can express $F(y)$ as

$$F(y) = \tfrac12 \bigl(y - x, A(y - x)\bigr). \tag{14'}$$

Clearly,

$$F(x) = 0.$$

$A$ being positive means that $(v, Av) > 0$ for $v \ne 0$. Thus (14)' shows that $F(y) > 0$ for $y \ne x$. This proves that $F(y)$, and therefore $E(y)$, takes on its minimum at $y = x$. □

Theorem 1 shows that the task of solving (1) can be accomplished by minimizing $E$. To find the point where $E$ assumes its minimum we shall use the method of steepest descent; that is, given an approximate minimizer $y$, we find a better approximation by moving from $y$ to a new point along the direction of the negative gradient of $E$. The gradient of $E$ is easily computed from formula (13):

$$\operatorname{grad} E(y) = Ay - b.$$

So if our $n$th approximation is $x_n$, then the $(n+1)$st, $x_{n+1}$, is

$$x_{n+1} = x_n - s(Ax_n - b), \tag{15}$$

where $s$ is the step length in the direction $-\operatorname{grad} E$. Using the concept (9) of residual we can rewrite (15) as

$$x_{n+1} = x_n - s r_n. \tag{15'}$$

We determine $s$ so that $E(x_{n+1})$ is as small as possible. This quadratic minimum problem is easily solved: using (13) and (9), we have

$$E(x_{n+1}) = \tfrac12 \bigl(x_n - s r_n, A(x_n - s r_n)\bigr) - (x_n - s r_n, b) = E(x_n) - s (r_n, r_n) + \tfrac12 s^2 (r_n, A r_n). \tag{15''}$$

Its minimum is reached for

$$s_n = \frac{(r_n, r_n)}{(r_n, A r_n)}. \tag{16}$$
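The iteration (15)' with the optimal step length (16) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `matvec`, `dot`, and `steepest_descent`, and the small positive definite test system, are hypothetical and not taken from the text.

```python
def matvec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    """Euclidean scalar product (u, v)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def steepest_descent(A, b, x0, steps):
    """Run `steps` iterations of x <- x - s_n r_n for a positive symmetric A."""
    x = list(x0)
    for _ in range(steps):
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]   # residual (9)
        rr = dot(r, r)
        if rr == 0.0:                                      # already exact
            break
        s = rr / dot(r, matvec(A, r))                      # step length (16)
        x = [xi - s * ri for xi, ri in zip(x, r)]          # update (15)'
    return x

# Illustrative system; its exact solution is x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = steepest_descent(A, b, [0.0, 0.0], 50)
print(x)
```

Each pass costs two matrix-vector products; storing `matvec(A, r)` and updating the residual by $r_{n+1} = r_n - s_n A r_n$ would reduce that to one.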


Theorem 2. The sequence of approximations defined by (15), with $s$ given by (16), converges to the solution $x$ of (1).

Proof. We need a couple of inequalities. We recall from Chapter 8 that for any vector $r$ the Rayleigh quotient

$$\frac{(r, Ar)}{(r, r)}$$

of a self-adjoint matrix $A$ lies between the smallest and largest eigenvalues of $A$. In our case these were denoted by $\alpha$ and $\beta$; so we deduce from (16) that

$$\frac{1}{\beta} \le s_n \le \frac{1}{\alpha}. \tag{17}$$

We conclude similarly that for all vectors $r$,

$$\frac{1}{\beta} \le \frac{(r, A^{-1} r)}{(r, r)} \le \frac{1}{\alpha}. \tag{17'}$$

We show now that $F(x_n)$ tends to zero as $n$ tends to $\infty$. Since we saw in Theorem 1 that $F(y)$, defined in (14), is positive everywhere except at $y = x$, it would follow that $x_n$ tends to $x$.

We recall the concept (8) of error $e_n = x_n - x$, and its relation (10) to the residual, $A e_n = r_n$. We can, using (14)' to express $F$, write

$$F(x_n) = \tfrac12 (e_n, A e_n) = \tfrac12 (e_n, r_n) = \tfrac12 (r_n, A^{-1} r_n). \tag{18}$$

Since $E$ and $F$ differ only by a constant, we deduce from (15)'' that

$$F(x_{n+1}) = F(x_n) - s (r_n, r_n) + \tfrac12 s^2 (r_n, A r_n).$$

Using the value (16) for $s$ we obtain

$$F(x_{n+1}) = F(x_n) - \tfrac12 s_n (r_n, r_n). \tag{18'}$$

Using (18), we can restate (18)' as

$$F(x_{n+1}) = F(x_n) \left[ 1 - s_n \frac{(r_n, r_n)}{(r_n, A^{-1} r_n)} \right]. \tag{19}$$

Using inequalities (17) and (17)', we deduce from (19) that

$$F(x_{n+1}) \le \left(1 - \frac{\alpha}{\beta}\right) F(x_n).$$

Applying this inequality recursively, we get, using (12), that

$$F(x_n) \le \left(1 - \frac{1}{\kappa}\right)^n F(x_0). \tag{20}$$

Using the boundedness of the Rayleigh quotient from below by the smallest eigenvalue, we conclude from (18) that

$$\frac{\alpha}{2} \|e_n\|^2 \le F(x_n).$$

Combining this with (20) we conclude that

$$\|e_n\|^2 \le \frac{2}{\alpha} \left(1 - \frac{1}{\kappa}\right)^n F(x_0). \tag{21}$$

This shows that the error $e_n$ tends to zero, as asserted in Theorem 2. □


2. AN ITERATIVE METHOD USING CHEBYSHEV POLYNOMIALS

Estimate (21) suggests that when the condition number $\kappa$ of $A$ is large, $x_n$ converges to $x$ very slowly. This in fact is the case; therefore there is need to devise iterative methods that converge faster; this will be carried out in the present and the following sections.

For the method described in this section we need a priori a positive lower bound $m$ for the smallest eigenvalue of $A$ and an upper bound $M$ for its largest eigenvalue: $m < \alpha$, $\beta < M$. It follows that all eigenvalues of $A$ lie in the interval $[m, M]$. According to (12), $\kappa = \beta/\alpha$; therefore $\kappa < M/m$. If $m$ and $M$ are sharp bounds, then $\kappa = M/m$.

We generate the sequence of approximations $\{x_n\}$ by the same recursion formula (15) as before,

$$x_{n+1} = (I - s_n A) x_n + s_n b, \tag{22}$$

but we shall choose the step lengths $s_n$ to be optimal after $N$ steps, not after each step; here $N$ is some appropriately chosen number.

Since the solution $x$ of (1) satisfies $x = (I - s_n A) x + s_n b$, we obtain after subtracting this from (22) that

$$e_{n+1} = (I - s_n A) e_n. \tag{23}$$

From this we deduce recursively that

$$e_N = P_N(A) e_0, \tag{24}$$

where $P_N$ is the polynomial

$$P_N(a) = \prod_{n=1}^{N} (1 - s_n a). \tag{24'}$$

From (24) we can estimate the size of $e_N$:

$$\|e_N\| \le \|P_N(A)\| \, \|e_0\|. \tag{25}$$

Since the matrix $A$ is self-adjoint, so is $P_N(A)$. It was shown in Chapter 8 that the norm of a self-adjoint matrix is $\max |p|$, $p$ any eigenvalue of $P_N(A)$. According to Theorem 4 of Chapter 6, the spectral mapping theorem, the eigenvalues $p$ of $P_N(A)$ are of the form $p = P_N(a)$, where $a$ is an eigenvalue of $A$. Since the eigenvalues of $A$ lie in the interval $[m, M]$, we conclude that

$$\|P_N(A)\| \le \max_{m \le a \le M} |P_N(a)|. \tag{26}$$


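Clearly, to get the best estimate for $\|e_N\|$ out of inequalities (25) and (26), we have to choose the $s_n$, $n = 1, \ldots, N$, so that the polynomial $P_N$ has as small a maximum on $[m, M]$ as possible. Polynomials of form (24)' satisfy the normalizing condition

$$P_N(0) = 1. \tag{27}$$

Among all polynomials of degree $N$ that satisfy (27), the one that has smallest maximum on $[m, M]$ is the rescaled Chebyshev polynomial. We recall that the $N$th Chebyshev polynomial $T_N$ is defined for $-1 \le u \le 1$ by

$$T_N(u) = \cos N\theta, \qquad u = \cos\theta. \tag{28}$$

The rescaling takes $[-1, 1]$ into $[m, M]$ and enforces (27):

$$P_N(a) = T_N\!\left(\frac{M + m - 2a}{M - m}\right) \Big/ \, T_N\!\left(\frac{M + m}{M - m}\right). \tag{29}$$

It follows from definition (28) that $|T_N(u)| \le 1$ for $|u| \le 1$. From this and (29) we deduce, using $M/m \simeq \kappa$, that

$$\max_{m \le a \le M} |P_N(a)| = 1 \Big/ \, T_N\!\left(\frac{\kappa + 1}{\kappa - 1}\right). \tag{29'}$$

Setting this into (26) and using (25), we get

$$\|e_N\| \le \|e_0\| \Big/ \, T_N\!\left(\frac{\kappa + 1}{\kappa - 1}\right). \tag{30}$$

Since outside the interval $[-1, 1]$ the Chebyshev polynomials tend to infinity, this proves that $e_N$ tends to zero as $N$ tends to $\infty$.

How fast $e_N$ tends to zero depends on how large $\kappa$ is. This calls for estimating $T_N(1 + \epsilon)$, $\epsilon$ small; we take $\theta$ in (28) imaginary:

$$\theta = i\phi, \qquad u = \cos i\phi = \frac{e^{\phi} + e^{-\phi}}{2} = 1 + \epsilon.$$

This is a quadratic equation for $e^{\phi}$, whose solution is

$$e^{\phi} = 1 + \epsilon + \sqrt{2\epsilon + \epsilon^2} = 1 + \sqrt{2\epsilon} + O(\epsilon).$$

So

$$T_N(1 + \epsilon) = \cos iN\phi = \frac{e^{N\phi} + e^{-N\phi}}{2} \simeq \tfrac12 \bigl(1 + \sqrt{2\epsilon}\bigr)^N.$$

Now set $(\kappa + 1)/(\kappa - 1) = 1 + \epsilon$; then $\epsilon \simeq 2/\kappa$, and

$$T_N\!\left(\frac{\kappa + 1}{\kappa - 1}\right) \simeq \tfrac12 \left(1 + \frac{2}{\sqrt{\kappa}}\right)^N. \tag{31}$$

Substituting this evaluation into (30) gives

$$\|e_N\| \le 2 \left(1 + \frac{2}{\sqrt{\kappa}}\right)^{-N} \|e_0\|. \tag{32}$$

Clearly, $e_N$ tends to zero as $N$ tends to infinity.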

When $\kappa$ is large, $\sqrt{\kappa}$ is very much smaller than $\kappa$; therefore for $\kappa$ large, the upper bound (32) for $\|e_N\|$ is very much smaller than the upper bound (21), $n = N$. This shows that the iterative method described in this section converges faster than the method described in Section 1. Put in another way, to achieve the same accuracy, we need to take far fewer steps when we use the method of this section than the method described in Section 1.


Exercise 2. Suppose $\kappa = 100$, $\|e_0\| = 1$, and $(1/\alpha) F(x_0) = 1$; how large do we have to take $N$ in order to make $\|e_N\| < 10^{-3}$, (a) using the method in Section 1, (b) using the method in Section 2?

To implement the method described in this section we have to pick a value of $N$. Once this is done, the values of $s_n$, $n = 1, \ldots, N$, are according to (24)' determined as the reciprocals of the roots of the modified Chebyshev polynomials (29):

$$s_k^{-1} = \tfrac12 \left( M + m - (M - m) \cos\frac{(k + \tfrac12)\pi}{N} \right),$$

$k$ any integer between $0$ and $N - 1$. Theoretically, that is, imagining all arithmetic operations to be carried out exactly, it does not matter in what order we arrange the numbers $s_k$. Practically, that is, operating with finite floating-point numbers, it matters a great deal. Half the roots of $P_N$ lie in the left half of the interval $[m, M]$; for these roots, $s > 2/(M + m)$, and so the matrix $(I - sA)$ has eigenvalues greater than 1 in absolute value. Repeated application of such matrices could fatally magnify round-off errors and render the algorithm arithmetically unstable.

There is a way of mitigating this instability; the other half of the roots of $P_N$ lie in the other half of the interval $[m, M]$, and for these $s$ all eigenvalues of the matrix $(I - sA)$ are less than 1 in absolute value. The trick is to alternate an unstable $s_k$ with a stable $s_k$.
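A small sketch of the method of this section, assuming the step lengths are the reciprocals of the roots of the rescaled Chebyshev polynomial (29). The function names, the bounds `m` and `M`, and the test system are illustrative assumptions; the steps are applied in their natural order, which is harmless for the small $N$ used here but is exactly the ordering issue discussed above when $N$ is large.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def chebyshev_steps(m, M, N):
    """Step lengths s_k: reciprocals of the N roots of the polynomial (29)."""
    return [2.0 / (M + m - (M - m) * math.cos((k + 0.5) * math.pi / N))
            for k in range(N)]

def run_two_term(A, b, x0, steps):
    """Apply recursion (22), x <- x - s (A x - b), once per step length."""
    x = list(x0)
    for s in steps:
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
        x = [xi - s * ri for xi, ri in zip(x, r)]
    return x

A = [[2.0, 1.0], [1.0, 2.0]]            # eigenvalues 1 and 3
b = [1.0, 0.0]
x_exact = [2.0 / 3.0, -1.0 / 3.0]       # solution of A x = b
m, M, N = 0.9, 3.1, 8                   # assumed a priori bounds and step count
xN = run_two_term(A, b, [0.0, 0.0], chebyshev_steps(m, M, N))

err = norm([xi - ei for xi, ei in zip(xN, x_exact)])
mu = (M + m) / (M - m)
# For mu > 1, T_N(mu) = cosh(N * acosh(mu)); the bound is ||e_0|| / T_N(mu).
bound = norm(x_exact) / math.cosh(N * math.acosh(mu))
print(err, bound)
```

The final error should not exceed `bound`, the estimate (30) with $M/m$ in place of $\kappa$.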


3. A THREE-TERM ITERATION USING CHEBYSHEV POLYNOMIALS

We describe now an entirely different way of generating the approximations described in Section 2, based on a recursion relation linking three consecutive Chebyshev polynomials. These are based on the addition formula of cosine:

$$\cos(n \pm 1)\theta = \cos\theta \cos n\theta \mp \sin\theta \sin n\theta.$$

Adding these yields

$$\cos(n + 1)\theta + \cos(n - 1)\theta = 2 \cos\theta \cos n\theta.$$

Using the definition (28) of Chebyshev polynomials we get

$$T_{n+1}(u) + T_{n-1}(u) = 2u\, T_n(u).$$

The polynomials $P_n$, defined in (29), are rescaled Chebyshev polynomials; therefore they satisfy an analogous recursion relation:

$$P_{n+1}(a) = (u_n a + v_n) P_n(a) + w_n P_{n-1}(a). \tag{33}$$

We will not bother to write down the exact values of $u_n, v_n, w_n$, except to note that, by construction, $P_n(0) = 1$ for all $n$; it follows from this and (33) that

$$v_n + w_n = 1. \tag{33'}$$

We define now a sequence $x_n$ recursively; we pick $x_0$, set $x_1 = (u_0 A + I) x_0 - u_0 b$, and for $n \ge 1$

$$x_{n+1} = (u_n A + v_n I) x_n + w_n x_{n-1} - u_n b. \tag{34}$$

Note that this is a three-term recursion formula, that is, $x_{n+1}$ is determined in terms of $x_n$ and $x_{n-1}$. Formulas (15) and (22) used in the last sections are two-term recursion formulas.

Subtract $x$ from both sides of (34); using (33)' and $Ax = b$ we get a recursion formula for the errors:

$$e_{n+1} = (u_n A + v_n I) e_n + w_n e_{n-1}. \tag{34'}$$

Solving (34)' recursively, it follows that each $e_n$ can be expressed in the form $e_n = Q_n(A) e_0$, where the $Q_n$ are polynomials of degree $n$, with $Q_0 = 1$. Setting this form of $e_n$ into (34)', we conclude that the polynomials $Q_n$ satisfy the same recursion relation as the $P_n$; since $Q_0 = P_0 = 1$, it follows that $Q_n = P_n$ for all $n$. Therefore

$$e_n = P_n(A) e_0 \tag{35}$$

for all $n$, and not just a single preassigned value $N$ as in equation (24) of Section 2.
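The text does not write out $u_n, v_n, w_n$. The sketch below uses values derived here, not quoted from the text, by dividing the Chebyshev recursion by $t_{n+1} = T_{n+1}(\mu)$ with $\mu = (M+m)/(M-m)$: $u_n = -4t_n/((M-m)t_{n+1})$, $v_n = 2\mu t_n/t_{n+1}$, $w_n = -t_{n-1}/t_{n+1}$, and $u_0 = -2/(M+m)$. All names and the test data are illustrative assumptions.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def chebyshev_three_term(A, b, x0, m, M, N):
    """Return x_N from recursion (34), written as x_{n+1} = v x_n + w x_{n-1} + u r_n."""
    mu = (M + m) / (M - m)
    t_prev, t_cur = 1.0, mu                      # T_0(mu), T_1(mu)
    r = [ax - bi for ax, bi in zip(matvec(A, x0), b)]
    x_prev = list(x0)
    u0 = -2.0 / (M + m)                          # P_1(a) = u_0 a + 1
    x = [xi + u0 * ri for xi, ri in zip(x0, r)]  # x_1
    for _ in range(1, N):
        t_next = 2.0 * mu * t_cur - t_prev       # Chebyshev recursion at mu
        u = -4.0 * t_cur / ((M - m) * t_next)
        v = 2.0 * mu * t_cur / t_next
        w = -t_prev / t_next                     # note v + w = 1, as in (33)'
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
        x_next = [v * xi + w * xp + u * ri
                  for xi, xp, ri in zip(x, x_prev, r)]
        x_prev, x = x, x_next
        t_prev, t_cur = t_cur, t_next
    return x

A = [[2.0, 1.0], [1.0, 2.0]]            # eigenvalues 1 and 3
b = [1.0, 0.0]
x_exact = [2.0 / 3.0, -1.0 / 3.0]
m, M, N = 0.9, 3.1, 8                   # assumed bounds and step count
xN = chebyshev_three_term(A, b, [0.0, 0.0], m, M, N)
err = norm([xi - ei for xi, ei in zip(xN, x_exact)])
bound = norm(x_exact) / math.cosh(N * math.acosh((M + m) / (M - m)))
print(err, bound)
```

Unlike the two-term version, every intermediate iterate here satisfies the Chebyshev bound, so the loop can be stopped at any $n$.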


4. OPTIMAL THREE-TERM RECURSION RELATION

In this section we shall use a three-term recursion relation of the form

$$x_{n+1} = (s_n A + p_n I) x_n + q_n x_{n-1} - s_n b \tag{36}$$

to generate a sequence of approximations that converges extremely rapidly to $x$. Unlike (34), the coefficients $s_n$, $p_n$, and $q_n$ are not fixed in advance but will be evaluated in terms of $r_{n-1}$ and $r_n$, the residuals corresponding to the approximations $x_{n-1}$ and $x_n$. Furthermore, we need no a priori estimates $m$, $M$ for the eigenvalues of $A$.

The first approximation $x_0$ is an arbitrary, or educated, guess. We shall use the corresponding residual, $r_0 = Ax_0 - b$, to completely determine the sequence of coefficients in (36), in a somewhat roundabout fashion. We pose the following minimum problem:

Among all polynomials of degree $n$ that satisfy the normalizing condition

$$Q(0) = 1, \tag{37}$$

determine the one that makes

$$\|Q(A) r_0\| \tag{38}$$

as small as possible.

We shall show that among all polynomials of degree less than or equal to $n$ satisfying condition (37) there is one that minimizes (38); denote such a polynomial by $Q_n$.

We formulate now the variational condition characterizing this minimum. Let $R(a)$ be any polynomial of degree less than $n$; then $aR(a)$ is of degree less than or equal to $n$. Let $\epsilon$ be any real number; $Q_n(a) + \epsilon a R(a)$ is then a polynomial of degree less than or equal to $n$ that satisfies condition (37). Since $Q_n$ minimizes (38), $\|(Q_n(A) + \epsilon A R(A)) r_0\|^2$ takes on its minimum at $\epsilon = 0$. Therefore its derivative with respect to $\epsilon$ is zero there:

$$(Q_n(A) r_0, A R(A) r_0) = 0. \tag{39}$$

We define now a scalar product for polynomials $Q$ and $R$ as follows:

$$\{Q, R\} = (Q(A) r_0, A R(A) r_0). \tag{40}$$

To analyze this product we introduce the eigenvectors of the matrix $A$:

$$A f_j = a_j f_j. \tag{41}$$

Since the matrix $A$ is real and self-adjoint, the $f_j$ can be taken to be real and orthonormal; since $A$ is positive, its eigenvalues $a_j$ are positive.

We expand $r_0$ in terms of the $f_j$,

$$r_0 = \sum_j w_j f_j. \tag{42}$$

Since the $f_j$ are eigenvectors of $A$, they are also eigenvectors of $Q(A)$ and $R(A)$, and by the spectral mapping theorem their eigenvalues are $Q(a_j)$ and $R(a_j)$, respectively. So

$$Q(A) r_0 = \sum_j w_j Q(a_j) f_j, \qquad R(A) r_0 = \sum_j w_j R(a_j) f_j. \tag{43}$$

Since the $f_j$ are orthonormal, we can express the scalar product (40) for polynomials $Q$ and $R$ as follows:

$$\{Q, R\} = \sum_j w_j^2 a_j Q(a_j) R(a_j). \tag{44}$$

Theorem 3. Suppose that in the expansion (42) of $r_0$ none of the coefficients $w_j$ are 0; suppose further that the eigenvalues $a_j$ of $A$ are distinct. Then (44) furnishes a Euclidean structure to the space of all polynomials of degree less than the order $K$ of the matrix $A$.

Proof. According to Chapter 7, a scalar product needs three properties. The first two, bilinearity and symmetry, are obvious from either (40) or (44). To show positivity, we note that since each $a_j > 0$,

$$\{Q, Q\} = \sum_j w_j^2 a_j Q^2(a_j) \tag{45}$$

is obviously nonnegative. Since the $w_j$ are assumed nonzero, (45) is zero iff $Q(a_j) = 0$ for all $a_j$, $j = 1, \ldots, K$. Since the degree of $Q$ is less than $K$, it can vanish at $K$ points only if $Q \equiv 0$. □

We can express the minimizing condition (39) concisely in the language of the scalar product (40): For $n < K$, $Q_n$ is orthogonal to all polynomials of degree less than $n$. It follows in particular that $Q_n$ is of degree $n$.

According to condition (37), $Q_0 \equiv 1$. Using the familiar Gram-Schmidt process we can, using the orthogonality and condition (37), determine a unique sequence of polynomials $Q_n$. We show now that this sequence satisfies a three-term recursion relation. To see this we express $aQ_n$ as a linear combination of $Q_j$, $j = 0, \ldots, n + 1$:

$$a Q_n = \sum_{j=0}^{n+1} c_{n,j} Q_j. \tag{46}$$

Since the $Q_j$ are orthogonal, we can express the $c_{n,j}$ as

$$c_{n,j} = \frac{\{a Q_n, Q_j\}}{\{Q_j, Q_j\}}. \tag{47}$$

Since $A$ is self-adjoint, the numerator in (47) can be rewritten as

$$\{Q_n, a Q_j\}. \tag{47'}$$

Since for $j < n - 1$, $aQ_j$ is a polynomial of degree less than $n$, it is orthogonal to $Q_n$, and so (47)' is zero; therefore $c_{n,j} = 0$ for $j < n - 1$. This shows that the right-hand side of (46) has only three nonzero terms and can be written in the form

$$a Q_n = b_n Q_{n+1} + c_n Q_n + d_n Q_{n-1}. \tag{48}$$

Since $Q_n$ is of degree $n$, $b_n \ne 0$. For $n = 0$, $d_0 = 0$.

According to condition (37), $Q_k(0) = 1$ for all $k$. Setting $a = 0$ in (48) we deduce that

$$b_n + c_n + d_n = 0. \tag{49}$$

From (47), with $j = n, n - 1$ we have

$$c_n = \frac{\{a Q_n, Q_n\}}{\{Q_n, Q_n\}}, \qquad d_n = \frac{\{a Q_n, Q_{n-1}\}}{\{Q_{n-1}, Q_{n-1}\}}. \tag{50}$$

Since $b_n \ne 0$, we can express $Q_{n+1}$ from (48) as follows:

$$Q_{n+1} = (s_n a + p_n) Q_n + q_n Q_{n-1}, \tag{51}$$

where

$$s_n = \frac{1}{b_n}, \qquad p_n = -\frac{c_n}{b_n}, \qquad q_n = -\frac{d_n}{b_n}. \tag{52}$$

Note that it follows from (49) and (52) that

$$p_n + q_n = 1. \tag{53}$$

Theoretically, the formulas (50) completely determine the quantities $c_n$ and $d_n$. Practically, these formulas are quite useless, since in order to evaluate the curly brackets we need to know the polynomials $Q_k$ and evaluate $Q_k(A)$. Fortunately $c_n$ and $d_n$ can be evaluated more easily, as we show next.

We start the algorithm by choosing an $x_0$; then the rest of the $x_n$ are determined by the recursion (36), with $s_n$, $p_n$, and $q_n$ from formulas (52), (50), and (49). We have defined $e_n$ to be $x_n - x$, the $n$th error; subtracting $x$ from (36), making use of (53) and that $b = Ax$, we obtain

$$e_{n+1} = (s_n A + p_n I) e_n + q_n e_{n-1}. \tag{54}$$

We claim that

$$e_n = Q_n(A) e_0. \tag{55}$$

To see this we replace the scalar argument $a$ in (51) by the matrix argument $A$:

$$Q_{n+1}(A) = (s_n A + p_n I) Q_n(A) + q_n Q_{n-1}(A). \tag{56}$$

Let both sides of (56) act on $e_0$; we get a recurrence relation that is the same as (54), except that $e_k$ is replaced by $Q_k(A) e_0$. Since $Q_0(A) = I$, the two sequences have the same starting point, and therefore they are the same, as asserted in (55).

We recall now that the residual $r_n = Ax_n - b$ is related to $e_n = x_n - x$ by $r_n = Ae_n$. Applying $A$ to (55) we obtain

$$r_n = Q_n(A) r_0. \tag{57}$$

Applying the mapping $A$ to (54) gives a recursion relation for the residuals:

$$r_{n+1} = (s_n A + p_n I) r_n + q_n r_{n-1}. \tag{58}$$

We now set $Q = R = Q_n$ into (40), and use relation (57) to write

$$\{Q_n, Q_n\} = (r_n, A r_n). \tag{59}$$

Subsequently we set $Q = aQ_n$, $R = Q_n$ into (40), and use relation (57) to write

$$\{a Q_n, Q_n\} = (A r_n, A r_n). \tag{59'}$$

Finally we set $Q = aQ_n$ and $R = Q_{n-1}$ into (40), and we use relation (57) to write

$$\{a Q_n, Q_{n-1}\} = (A r_n, A r_{n-1}). \tag{59''}$$

We set these identities into (50):

$$c_n = \frac{(A r_n, A r_n)}{(r_n, A r_n)}, \qquad d_n = \frac{(A r_n, A r_{n-1})}{(r_{n-1}, A r_{n-1})}. \tag{60}$$

From (49) we determine $b_n = -(c_n + d_n)$. Set these expressions into (52) and we obtain expressions for $s_n$, $p_n$, and $q_n$ that are simple to evaluate once $r_{n-1}$ and $r_n$ are known; these residuals can be calculated as soon as we know $x_{n-1}$ and $x_n$, or from recursion (58). This completes the recursive definition of the sequence $x_n$.
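The recursive definition above can be sketched as follows, using (60) for $c_n$ and $d_n$, (49) for $b_n$, and (52) for $s_n, p_n, q_n$, with $d_0 = 0$. The function name, the test matrix, and the right-hand side are illustrative assumptions. Recursion (36) is applied in the equivalent form $x_{n+1} = p_n x_n + q_n x_{n-1} + s_n r_n$.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def optimal_three_term(A, b, x0, steps):
    """Iterate (36) with coefficients from (60), (49), (52); A positive symmetric."""
    x, x_prev, r_prev = list(x0), None, None
    for n in range(steps):
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]      # r_n = A x_n - b
        if dot(r, r) == 0.0:                                  # exact solution reached
            break
        Ar = matvec(A, r)
        c = dot(Ar, Ar) / dot(r, Ar)                          # c_n from (60)
        if n == 0:
            d = 0.0                                           # d_0 = 0
        else:
            Ar_prev = matvec(A, r_prev)
            d = dot(Ar, Ar_prev) / dot(r_prev, Ar_prev)       # d_n from (60)
        bn = -(c + d)                                         # (49)
        s, p, q = 1.0 / bn, -c / bn, -d / bn                  # (52)
        if n == 0:
            x_next = [p * xi + s * ri for xi, ri in zip(x, r)]
        else:
            x_next = [p * xi + q * xp + s * ri
                      for xi, xp, ri in zip(x, x_prev, r)]
        x_prev, r_prev, x = x, r, x_next
    return x

# Illustrative 3 x 3 positive symmetric system.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x3 = optimal_three_term(A, b, [0.0, 0.0, 0.0], 3)
res = [ax - bi for ax, bi in zip(matvec(A, x3), b)]
print(x3, res)
```

With three steps on this matrix of order three, the residual should vanish up to rounding.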


Theorem 4. Let $K$ be the order of the matrix $A$, and let $x_K$ be the $K$th term of the sequence (36), the coefficients being defined by (52) and (60). We claim that $x_K$ satisfies equation (1), $Ax_K = b$.

Proof. $Q_K$ is defined as that polynomial of degree $K$ which satisfies (37) and minimizes (38). We claim that this polynomial is $p_A / p_A(0)$, $p_A$ the characteristic polynomial of $A$; note that $p_A(0) \ne 0$, since 0 is not an eigenvalue of $A$. According to the Cayley-Hamilton theorem, Theorem 5 of Chapter 6, $p_A(A) = 0$; clearly, $Q_K(A) = 0$ minimizes $\|Q(A) r_0\|$. According to (57), $r_K = Q_K(A) r_0$; since according to the above discussion, $Q_K(A) = 0$, this proves that the $K$th residual $r_K$ is zero, and therefore $x_K$ exactly solves (1). □

One should not be misled by Theorem 4; the virtue of the sequence $x_n$ is not that it furnishes the exact answer in $K$ steps, but that, for a large class of matrices of practical interest, it furnishes an excellent approximation to the exact answer in far fewer steps than $K$. Suppose for instance that $A$ is the discretization of an operator of the form identity plus a compact operator. Then most of the eigenvalues of $A$ would be clustered around 1; say all but the first $k$ eigenvalues $a_j$ of $A$ are located in the interval $(1 - \delta, 1 + \delta)$.

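Since $Q_n$ was defined as the minimizer of (38) subject to the condition $Q(0) = 1$, and since according to (57), $Q_n(A) r_0 = r_n$, we conclude that

$$\|r_n\| \le \|Q(A) r_0\|$$

for any polynomial $Q$ of degree $n$ that satisfies $Q(0) = 1$. Using the expansion (43) we write this inequality as

$$\|r_n\|^2 \le \sum_j w_j^2 Q^2(a_j), \tag{61}$$

where the $w_j$ are the coefficients in the expansion (42) of $r_0$.

We set now $n = k + l$, and we choose $Q$ as follows:

$$Q(a) = \prod_{j=1}^{k} \left(1 - \frac{a}{a_j}\right) T_l\!\left(\frac{1 - a}{\delta}\right) \Big/ \, T_l\!\left(\frac{1}{\delta}\right). \tag{62}$$

$Q$ satisfies condition (37), $Q(0) = 1$. For $u$ large, $T_l(u)$ is dominated by its leading term, which is $2^{l-1} u^l$. Therefore,

$$T_l\!\left(\frac{1}{\delta}\right) \simeq \tfrac12 \left(\frac{2}{\delta}\right)^l. \tag{63}$$

By construction, $Q$ vanishes at $a_1, \ldots, a_k$. We have assumed that all the other $a_j$ lie in $(1 - \delta, 1 + \delta)$; since the Chebyshev polynomials do not exceed 1 in absolute value in $(-1, 1)$, it follows from (62) and (63) that for $j > k$,

$$|Q(a_j)| \le \text{const} \left(\frac{\delta}{2}\right)^l, \tag{64}$$

where

$$\text{const} \simeq 2 \prod_{i=1}^{k} \left|1 - \frac{1}{a_i}\right|. \tag{65}$$

Setting all this information about $Q(a_j)$ into (61) we obtain

$$\|r_{k+l}\|^2 \le \text{const}^2 \left(\frac{\delta}{2}\right)^{2l} \sum_{j > k} w_j^2 \le \text{const}^2 \left(\frac{\delta}{2}\right)^{2l} \|r_0\|^2. \tag{66}$$

For example, if $|a_j - 1| < 0.2$ for $j > 10$, and if the constant (65) is less than 10, then choosing $l = 20$ in (66) makes $\|r_{30}\|$ less than $10^{-19} \|r_0\|$.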

Exercise 3. Write a computer program to evaluate the quantities $s_n$, $p_n$, and $q_n$.

Exercise 4. Use the computer program to solve a system of equations of your choice.




CHAPTER 18 


How to Calculate the Eigenvalues
of Self-Adjoint Matrices


1. One of the most effective methods for calculating approximately the eigenvalues of a self-adjoint matrix is based on the QR decomposition.

Theorem 1. Every real invertible square matrix $A$ can be factored as

$$A = QR, \tag{1}$$

where $Q$ is an orthogonal matrix and $R$ is an upper triangular matrix whose diagonal entries are positive.

Proof. The columns of $Q$ are constructed out of the columns of $A$ by Gram-Schmidt orthonormalization. So the $j$th column $q_j$ of $Q$ is a linear combination of the first $j$ columns $a_1, \ldots, a_j$ of $A$:

$$q_1 = c_{11} a_1, \qquad q_2 = c_{12} a_1 + c_{22} a_2,$$

etc. We can invert the relation between the $q$'s and the $a$'s:

$$a_1 = r_{11} q_1, \qquad a_2 = r_{12} q_1 + r_{22} q_2, \qquad \ldots, \qquad a_n = r_{1n} q_1 + \cdots + r_{nn} q_n. \tag{2}$$

Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright 2007 John Wiley & Sons, Inc.

Since $A$ is invertible, its columns are linearly independent. It follows that all coefficients $r_{11}, \ldots, r_{nn}$ in (2) are nonzero.

We may multiply any of the vectors $q_j$ by $-1$ without affecting their orthonormality. In this way we can make all the coefficients $r_{11}, \ldots, r_{nn}$ in (2) positive. Here $A$ is an $n \times n$ matrix.

Denote the matrix whose columns are $q_1, \ldots, q_n$ by $Q$, and denote by $R$ the matrix

$$R_{ij} = \begin{cases} r_{ij} & \text{for } i \le j, \\ 0 & \text{for } i > j. \end{cases} \tag{3}$$

Relation (2) can be written as a matrix product

$$A = QR.$$

Since the columns of $Q$ are orthonormal, $Q$ is an orthogonal matrix. It follows from the definition (3) of $R$ that $R$ is upper triangular. So $A = QR$ is the sought-after factorization (1). □

The factorization (1) can be used to solve the system of equations

$$Ax = u.$$

Replace $A$ by its factored form,

$$QRx = u,$$

and multiply by $Q^T$ on the left. Since $Q$ is an orthogonal matrix, $Q^T Q = I$, and we get

$$Rx = Q^T u. \tag{4}$$

Since $R$ is upper triangular and its diagonal entries are nonzero, the system of equations can be solved recursively, starting with the $n$th equation to determine $x_n$, then the $(n - 1)$st equation to determine $x_{n-1}$, and so all the way down to $x_1$.
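As an illustration, here is a minimal sketch of the Gram-Schmidt construction of $Q$ and $R$ followed by the recursive solution of (4). It assumes a real invertible matrix stored as a list of rows; the function names and the test data are hypothetical. Classical Gram-Schmidt is used for clarity and is not the numerically preferred variant.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def qr_gram_schmidt(A):
    """Return (Q, R) with A = QR, Q orthogonal, R upper triangular, R[j][j] > 0."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]   # columns a_j of A
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        v = list(a)
        for i, q in enumerate(q_cols):
            R[i][j] = dot(q, a)                              # r_ij = (q_i, a_j)
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(dot(v, v))                       # positive: columns independent
        q_cols.append([vk / R[j][j] for vk in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def solve_qr(A, u):
    """Solve A x = u via R x = Q^T u and back substitution."""
    n = len(A)
    Q, R = qr_gram_schmidt(A)
    y = [sum(Q[i][k] * u[i] for i in range(n)) for k in range(n)]   # y = Q^T u
    x = [0.0] * n
    for i in reversed(range(n)):                             # last equation first
        x[i] = (y[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
Q, R = qr_gram_schmidt(A)
x = solve_qr(A, [3.0, 5.0])      # exact solution is (0.8, 1.4)
print(R, x)
```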
In this chapter we shall show how to use the QR factorization of a real symmetric matrix $A$ to find its eigenvalues. The QR algorithm was invented by J. G. F. Francis in 1961; it goes as follows:

Let $A$ be a real symmetric matrix; we may assume that $A$ is invertible, for we may add a constant multiple of the identity to $A$. Find the QR factorization of $A$:

$$A = QR.$$

Define $A_1$ by switching the factors $Q$ and $R$:

$$A_1 = RQ. \tag{5}$$

We claim that

(i) $A_1$ is real and symmetric, and
(ii) $A_1$ has the same eigenvalues as $A$.

To see these we express $R$ in terms of $A$ and $Q$ by multiplying equation (1) by $Q^T$ on the left. Since $Q^T Q = I$, we get

$$Q^T A = R.$$

Setting this into (5) gives

$$A_1 = Q^T A Q, \tag{6}$$

from which (i) and (ii) follow.

We continue this process, getting a sequence of matrices $\{A_k\}$, each linked to the next one by the relations

$$A_{k-1} = Q_k R_k, \tag*{$(7)_k$}$$
$$A_k = R_k Q_k. \tag*{$(8)_k$}$$

From these we deduce, as before, that

$$A_k = Q_k^T A_{k-1} Q_k. \tag*{$(9)_k$}$$

It follows that all the matrices $A_k$ are symmetric, and they all have the same eigenvalues.

Combining the relations $(9)_k, (9)_{k-1}, \ldots, (9)_1$, we get

$$A_k = Q^{(k)T} A Q^{(k)}, \tag*{$(10)_k$}$$

where

$$Q^{(k)} = Q_1 Q_2 \cdots Q_k. \tag{11}$$

Define similarly

$$R^{(k)} = R_k R_{k-1} \cdots R_1. \tag{12}$$

We claim that

$$A^k = Q^{(k)} R^{(k)}. \tag*{$(13)_k$}$$

For $k = 1$ this is relation (1). We argue inductively; suppose $(13)_{k-1}$ is true:

$$A^{k-1} = Q^{(k-1)} R^{(k-1)}.$$

Multiply this by $A$ on the left:

$$A^k = A Q^{(k-1)} R^{(k-1)}. \tag{14}$$

Multiply equation $(10)_{k-1}$ by $Q^{(k-1)}$ on the left. Since $Q^{(k-1)}$ is a product of orthogonal matrices, it is itself orthogonal, and so $Q^{(k-1)} Q^{(k-1)T} = I$. So we get that

$$Q^{(k-1)} A_{k-1} = A Q^{(k-1)}.$$

Combining this with (14) gives

$$A^k = Q^{(k-1)} A_{k-1} R^{(k-1)}.$$

Now use $(7)_k$ to express $A_{k-1}$, and we get relation $(13)_k$. This completes the inductive proof of $(13)_k$. □

Formula (12) defines $R^{(k)}$ as the product of upper triangular matrices. Therefore $R^{(k)}$ itself is upper triangular, and so $(13)_k$ is the QR factorization of $A^k$.

Denote the normalized eigenvectors of $A$ by $u_1, \ldots, u_n$, its corresponding eigenvalues by $d_1, \ldots, d_n$.

Denote by $U$ the matrix whose columns are the eigenvectors,

$$U = (u_1, \ldots, u_n),$$

and by $D$ the diagonal matrix whose entries are $d_1, \ldots, d_n$. The spectral representation of $A$ is

$$A = U D U^T. \tag{15}$$

Therefore the spectral representation of $A^k$ is

$$A^k = U D^k U^T. \tag*{$(15)_k$}$$

It follows from formula $(15)_k$ that the columns of $A^k$ are linear combinations of the eigenvectors of $A$ of the following form:

$$b_1 d_1^k u_1 + \cdots + b_n d_n^k u_n, \tag{15'}$$

where $b_1, \ldots, b_n$ do not depend on $k$. We assume now that the eigenvalues of $A$ are distinct and positive; arrange them in decreasing order:

$$d_1 > d_2 > \cdots > d_n > 0.$$

It follows then from (15)' that, provided $b_1 \ne 0$, for $k$ large enough the first column of $A^k$ is very close to a multiple of $u_1$. Therefore $q_1^{(k)}$, the first column of $Q^{(k)}$, is very close to $u_1$. Similarly, the second column $q_2^{(k)}$ of $Q^{(k)}$ would be very close to $u_2$, and so on, up to $q_n^{(k)} \simeq u_n$.

We turn now to formula $(10)_k$; it follows that the $i$th diagonal element of $A_k$ is

$$(A_k)_{ii} = q_i^{(k)T} A q_i^{(k)} = \bigl(q_i^{(k)}, A q_i^{(k)}\bigr).$$

The quantity on the right is the Rayleigh quotient of $A$, evaluated at $q_i^{(k)}$. It was explained in Chapter 8 that if the vector $q_i^{(k)}$ differs by $\epsilon$ from the $i$th eigenvector of $A$, then the Rayleigh quotient differs by less than $O(\epsilon^2)$ from the $i$th eigenvalue $d_i$ of $A$. This shows that if the QR algorithm is carried out far enough, the diagonal entries of $A_k$ are very good approximations to the eigenvalues of $A$, arranged in decreasing order.
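The iteration $(7)_k$, $(8)_k$ can be tried out with a short sketch. It assumes a symmetric matrix with distinct positive eigenvalues, factors each $A_{k-1}$ by Gram-Schmidt, and applies no shifts or other accelerations; the function names and the $2 \times 2$ example are illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def qr_gram_schmidt(A):
    """Factor A = QR by Gram-Schmidt on the columns of A (A assumed invertible)."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]
    q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        v = list(a)
        for i, q in enumerate(q_cols):
            R[i][j] = dot(q, a)
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(dot(v, v))
        q_cols.append([vk / R[j][j] for vk in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def qr_algorithm(A, iterations):
    """Repeat: factor A_{k-1} = Q_k R_k, then set A_k = R_k Q_k."""
    Ak = [row[:] for row in A]
    for _ in range(iterations):
        Q, R = qr_gram_schmidt(Ak)
        Ak = matmul(R, Q)
    return Ak

A = [[2.0, 1.0], [1.0, 2.0]]     # eigenvalues 3 and 1
Ak = qr_algorithm(A, 60)
print(Ak)
```

For this matrix the first step gives $A_1$ with diagonal $(2.8, 1.2)$, and the diagonal then approaches $(3, 1)$, the eigenvalues in decreasing order.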

Exercise 1. Show that the off-diagonal entries of $A_k$ tend to zero as $k$ tends to $\infty$.

Numerical calculations bear out these contentions.

2. Next we describe another algorithm, due to Alston Householder, for 
accomplishing the QR factorization of a matrix A. In this algorithm, Q is constructed 
as a product of particularly simple orthogonal transformations, known as reflections. 

A Householder reflection is simply a reflection across a hyperplane, that is, a subspace of form $v^T x = 0$. A reflection $H$ maps all points of the hyperplane into themselves, and it maps points $x$ off the hyperplane into their reflection across the hyperplane. The analytical expression of $H$ is

$$Hx = x - 2 \frac{v^T x}{\|v\|^2} v. \tag{16}$$

Note that if we replace $v$ by a multiple of $v$, the mapping $H$ is unchanged.

Exercise 2. Show that the mapping (16) is norm-preserving. 

We shall show now how reflections can be used to accomplish the QR factorization of a matrix $A$. $Q$ will be constructed from $n$ reflections:

$$Q^T = H_n H_{n-1} \cdots H_1.$$

$H_1$ is chosen so that the first column of $H_1 A$ is a multiple of $e_1 = (1, 0, \ldots, 0)^T$. That requires that $H_1 a_1$ be a multiple of $e_1$, where $a_1$ is the first column of $A$; since $H_1$ is norm-preserving, that multiple has to have absolute value $\|a_1\|$. This leaves two choices:

$$H_1 a_1 = \|a_1\| e_1 \quad \text{or} \quad H_1 a_1 = -\|a_1\| e_1.$$

Setting $x = a_1$ into (16), we get for $H_1$ the relation

$$a_1 - 2 \frac{v^T a_1}{\|v\|^2} v = \|a_1\| e_1 \quad \text{or} \quad a_1 - 2 \frac{v^T a_1}{\|v\|^2} v = -\|a_1\| e_1,$$

which gives two choices for $v$:

$$v_+ = a_1 - \|a_1\| e_1 \quad \text{or} \quad v_- = a_1 + \|a_1\| e_1. \tag{17}$$

We recall that the arithmetical operations in a computer carry a finite number of digits. Therefore when two nearly equal numbers are subtracted, the relative error in the difference is quite large. To prevent such loss, we choose in (17) the larger of the two vectors $v_+$ or $v_-$ for $v$.
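A small sketch of the reflection (16) and of the choice between the two vectors in (17) follows. It assumes a nonzero real vector; the function names and the sample vector are illustrative. The larger of $v_+$ and $v_-$ is $v_-$ when the first component of $a_1$ is nonnegative and $v_+$ otherwise, since $\|v_\mp\|^2 = 2\|a_1\|^2 \pm 2\|a_1\|(a_1)_1$.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def reflect(v, x):
    """Apply the reflection (16): Hx = x - 2 (v.x / ||v||^2) v."""
    c = 2.0 * dot(v, x) / dot(v, v)
    return [xi - c * vi for xi, vi in zip(x, v)]

def householder_vector(a):
    """Pick the larger of v_+ and v_- in (17), avoiding cancellation; a must be nonzero."""
    norm_a = math.sqrt(dot(a, a))
    v = list(a)
    v[0] += norm_a if a[0] >= 0 else -norm_a
    return v

a = [3.0, 4.0, 0.0]
v = householder_vector(a)        # here v = v_- = (8, 4, 0)
Ha = reflect(v, a)               # a multiple of e_1 with absolute value ||a|| = 5
print(v, Ha)
```

In this example $H a = -\|a\| e_1 = (-5, 0, 0)$, and the norm of $a$ is preserved, as Exercise 2 asserts in general.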

Having chosen $H_1$, denote $H_1 A$ as $A_1$; it is of form

$$A_1 = \begin{pmatrix} \times & \times & \cdots & \times \\ 0 & & & \\ \vdots & & A^{(1)} & \\ 0 & & & \end{pmatrix},$$

where $A^{(1)}$ is an $(n - 1) \times (n - 1)$ matrix.

Choose $H_2$ to be of the form

$$H_2 = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & H^{(2)} & \\ 0 & & & \end{pmatrix},$$

where $H^{(2)}$ is chosen as before so that the first column of

$$H^{(2)} A^{(1)}$$

is of the form $(\times, 0, \ldots, 0)^T$. Then the first column of the product $H_2 A_1$ is the same as the first column of $A_1$, while the second column is of the form $(\times, \times, 0, \ldots, 0)^T$. We continue in this fashion for $n$ steps; clearly, $A_n = H_n \cdots H_1 A$ is upper triangular. Then we set $R = A_n$ and $Q = H_1^T \cdots H_n^T$ and obtain the QR factorization (1) of $A$. □

Next we show how reflections can be used to bring any symmetric matrix $A$ into tridiagonal form $L$ by an orthogonal similarity transformation:

$$O A O^T = L. \tag{18}$$

$O$ is a product of reflections:

$$O = H_{n-1} \cdots H_1. \tag{18'}$$

$H_1$ is of the form

$$H_1 = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & H^{(1)} & \\ 0 & & & \end{pmatrix}. \tag{19}$$

Denote the first column of $A$ as

$$a_1 = \begin{pmatrix} \times \\ a^{(1)} \end{pmatrix},$$

where $a^{(1)}$ is a column vector with $n - 1$ components. Then the action of $H_1$ of form (19) is as follows: $H_1 A$ has the same first row as $A$, and the last $n - 1$ entries of the first column of $H_1 A$ form the vector $H^{(1)} a^{(1)}$.

We choose $H^{(1)}$ as a reflection in $\mathbb{R}^{n-1}$ that maps $a^{(1)}$ into a vector whose last $n - 2$ components are zero. Thus the first column of $H_1 A$ has zeros in the last $n - 2$ places.

Multiplying an $n \times n$ matrix by a matrix of the form (19) on the right leaves the first column unaltered. Therefore the first column of

$$A_1 = H_1 A H_1^T$$

has zeros in the last $n - 2$ rows.

In the next step we choose $H_2$ of the form

$$H_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & H^{(2)} \end{pmatrix}, \tag{20}$$

where $H^{(2)}$ is an $(n - 2) \times (n - 2)$ reflection. Since the first column of $A_1$ has zeros in the last $n - 2$ rows, the first column of $H_2 A_1$ is the same as the first column of $A_1$. We choose the reflection $H^{(2)}$ so that the second column of $H_2 A_1$ has zeros in the last $n - 3$ rows.

For $H_2$ of form (20), multiplication on the right by $H_2^T$ leaves the first two columns unchanged. Therefore

$$A_2 = H_2 A_1 H_2^T$$

has $n - 2$ and $n - 3$ zeros, respectively, in the first and second columns. Continuing in this fashion, we construct the reflections $H_3, \ldots, H_{n-1}$. Their product $O = H_{n-1} \cdots H_1$ has the property that $O A O^T$ has all $ij$th entries zero when $i > j + 1$. But since $O A O^T$ is symmetric, so are all entries for $j > i + 1$. This shows that $O A O^T$ is tridiagonal. □

We note that Jacobi proposed an algorithm for tridiagonalizing symmetric matrices. This was implemented by Wallace Givens.
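The reduction to tridiagonal form by reflections, as described in the construction of (18), can be sketched as follows. The function name and the $4 \times 4$ test matrix are illustrative assumptions. Each reflection is applied as a rank-one update from the left and then from the right (reflections are symmetric, so $H^T = H$), rather than by forming $H_k$ as a matrix.

```python
import math

def tridiagonalize(A):
    """Return O A O^T for symmetric A, with O a product of Householder reflections."""
    n = len(A)
    L = [row[:] for row in A]
    for k in range(n - 2):
        a = [L[i][k] for i in range(k + 1, n)]     # column k below the diagonal
        norm_a = math.sqrt(sum(t * t for t in a))
        if norm_a == 0.0:
            continue                               # nothing to eliminate
        v = [0.0] * (k + 1) + a                    # v vanishes in the first k+1 places
        v[k + 1] += norm_a if a[0] >= 0 else -norm_a
        vv = sum(t * t for t in v)
        for j in range(n):                         # L <- H L, column by column
            c = 2.0 * sum(v[i] * L[i][j] for i in range(n)) / vv
            for i in range(n):
                L[i][j] -= c * v[i]
        for i in range(n):                         # L <- L H, row by row
            c = 2.0 * sum(L[i][j] * v[j] for j in range(n)) / vv
            for j in range(n):
                L[i][j] -= c * v[j]
    return L

A = [[4.0, 1.0, 2.0, 2.0],
     [1.0, 2.0, 0.0, 1.0],
     [2.0, 0.0, 3.0, 2.0],
     [2.0, 1.0, 2.0, 1.0]]
L = tridiagonalize(A)
trace_L = sum(L[i][i] for i in range(4))
print(L)
```

Since the transformation is an orthogonal similarity, the trace is unchanged; here the first subdiagonal entry has absolute value $\|(1, 2, 2)\| = 3$.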

Theorem 2. When the QR algorithm $(7)_k$, $(8)_k$ is applied to a real, symmetric, tridiagonal matrix $L$, all the matrices $L_k$ produced by the algorithm are real, symmetric, and tridiagonal, and have the same eigenvalues as $L$.

Proof. We have already shown, see $(9)_k$, that $L_k$ is symmetric and has the same eigenvalues as $L$. To show that $L_k$ is tridiagonal, we start with $L = L_0$ tridiagonal and then argue by induction on $k$. Suppose $L_{k-1}$ is tridiagonal and is factored as $L_{k-1} = Q_k R_k$. We recall that the $j$th column $q_j$ of $Q_k$ is a linear combination of the first $j$ columns of $L_{k-1}$; since $L_{k-1}$ is tridiagonal, the last $n - j - 1$ entries of $q_j$ are zero. The $j$th column of $R_k Q_k$ is $R_k q_j$; since $R_k$ is upper triangular, it follows that the last $n - j - 1$ entries of $R_k q_j$ are zero. This shows that the $ij$th entry of $L_k = R_k Q_k$ is zero for $i > j + 1$. Since $L_k$ is symmetric, this proves that $L_k$ is tridiagonal, completing the induction. □

Having $L$, and thereby all subsequent $L_k$, in tridiagonal form greatly reduces the number of arithmetic operations needed to carry out the QR algorithm.

So the strategy for the tridiagonal case of the QR algorithm is to carry out the QR iteration until the off-diagonal entries of $L_N$ are less than a small number. The diagonal elements of $L_N$ are good approximations to the eigenvalues of $L$.


3. Deift, Nanda, and Tomei observed that the Toda flow is a continuous analogue of the QR iteration. Flaschka has shown that the differential equations for the Toda flow can be put into commutator form, that is, in the form

$$\frac{d}{dt} L = BL - LB, \tag{21}$$

where $L$ is a symmetric tridiagonal matrix

$$L = \begin{pmatrix} a_1 & b_1 & & 0 \\ b_1 & a_2 & \ddots & \\ & \ddots & \ddots & b_{n-1} \\ 0 & & b_{n-1} & a_n \end{pmatrix} \tag{22}$$

and $B$ is the antisymmetric tridiagonal matrix

$$B = \begin{pmatrix} 0 & b_1 & & 0 \\ -b_1 & 0 & \ddots & \\ & \ddots & \ddots & b_{n-1} \\ 0 & & -b_{n-1} & 0 \end{pmatrix}. \tag{23}$$

Exercise 3. (i) Show that $BL - LB$ is a tridiagonal matrix.

(ii) Show that if $L$ satisfies the differential equation (21), its entries satisfy

$$\frac{d}{dt} a_k = 2\bigl(b_k^2 - b_{k-1}^2\bigr), \qquad \frac{d}{dt} b_k = b_k (a_{k+1} - a_k), \tag{24}$$

where $k = 1, \ldots, n$ and $b_0 = b_n = 0$.

Theorem 3. Solutions L(r) of equations in commutator form (21)，where B is 
antisymmetric, are isospectmL 

Proof. Let the matrix V(/) be the solution of the differential equation. 

*V = BV. V(0) = L (25) 

at 

Since B(r) is antisymmetric, the transpose uf (25) is 

jV T = -V t B ， V'O) = I. (25) t 

Using Ihe producl rule for differential ion and equations (25) and (25) , we get 


x 基 v x 

=—V T BV 十 v t bv = 0. 

Since V T V = 1 at / = 0, it follows that V 1 fO v (0 = I for all /, This proves that for all 
/, V(r) is an orthogonal matrix. 

We claim that if L(t) is a solution of (21) and V(t) a solution of (25), then

$$V^T(t)\,L(t)\,V(t) \tag{26}$$

is independent of t. Differentiate (26) with respect to t; using the product rule, we get

$$\left(\frac{d}{dt} V^T\right) L V + V^T \left(\frac{d}{dt} L\right) V + V^T L \left(\frac{d}{dt} V\right). \tag{27}$$

Using equations (21), (25), and $(25)^T$, we can rewrite (27) as

$$-V^T B L V + V^T (BL - LB) V + V^T L B V,$$

which is zero. This shows that the derivative of (26) is zero, and therefore (26) is
independent of t. At $t = 0$, (26) equals L(0), since V(0) is the identity; so

$$V^T(t)\,L(t)\,V(t) = L(0). \tag{28}$$

Since V(t) is an orthogonal matrix, (28) shows that L(t) is related to L(0) by an
orthogonal similarity. This completes the proof of Theorem 3. □
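As a small cross-check of Theorem 3, the two simplest spectral invariants, $\operatorname{tr} L$ and $\operatorname{tr} L^2$, can be seen to be constant directly from the entrywise equations (24). The computation below is a worked illustration and uses only (24) and the convention $b_0 = b_n = 0$:

$$\frac{d}{dt} \operatorname{tr} L = \sum_{k=1}^{n} \frac{d}{dt} a_k = 2 \sum_{k=1}^{n} \left(b_k^2 - b_{k-1}^2\right) = 2\left(b_n^2 - b_0^2\right) = 0,$$

$$\frac{d}{dt} \operatorname{tr} L^2 = \frac{d}{dt}\left(\sum_{k=1}^{n} a_k^2 + 2 \sum_{k=1}^{n-1} b_k^2\right) = 4 \sum_{k=1}^{n} a_k \left(b_k^2 - b_{k-1}^2\right) + 4 \sum_{k=1}^{n-1} b_k^2 \left(a_{k+1} - a_k\right) = 0,$$

since shifting the index in $\sum_k a_k b_{k-1}^2$ turns the first sum into $\sum_k b_k^2 (a_k - a_{k+1})$, which cancels the second.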

Formula (28) shows that if L(0) is real symmetric, which we assume, then L(t)
is symmetric for all t.

The spectral representation of a symmetric matrix L is

$$L = U D U^T, \tag{29}$$

where D is a diagonal matrix whose entries are the eigenvalues of L, and the columns
of U are the normalized eigenvectors of L; (29) shows that a set of symmetric
matrices whose eigenvalues are uniformly bounded is itself uniformly bounded. So
we conclude from Theorem 3 that the set of matrices L(t) is uniformly bounded. It
follows from this that the system of quadratic equations (24) has a solution for all
values of t.


Lemma 4. An off-diagonal entry $b_k(t)$ of L(t) is either nonzero for all t, or zero
for all t.


Proof. Let $[t_0, t_1]$ be an interval on which $b_k(t)$ is nonzero. Divide the differential
equation (24) for $b_k$ by $b_k$ and integrate it from $t_0$ to $t_1$:

$$\log b_k(t_1) - \log b_k(t_0) = \int_{t_0}^{t_1} (a_{k+1} - a_k)\, dt.$$

Since, as we have shown, the functions $a_k$ are uniformly bounded for all t, the
integral on the right can tend to $\infty$ only if $t_0$ or $t_1$ tends to $\infty$. This shows that
$\log b_k(t)$ is bounded away from $-\infty$, and therefore $b_k(t)$ is bounded away from zero
on any interval of t. This proves that if $b_k(t)$ is nonzero for a single value of t, it is
nonzero for all t. □


If one of the off-diagonal entries $b_k$ of L(0) were zero, the matrix L(0) would fall
apart into two matrices. We assume that this is not the case; then it follows from
Lemma 4 that the $b_k(t)$ are nonzero for all t and all k.

Lemma 5. Suppose none of the off-diagonal terms $b_k$ in L is zero.


(i) The first component $u_{1k}$ of every eigenvector $u_k$ of L is nonzero.
(ii) Each eigenvalue of L is simple.

Proof. (i) The first component of the eigenvalue equation

$$L u_k = d_k u_k \tag{30}$$

is

$$a_1 u_{1k} + b_1 u_{2k} = d_k u_{1k}. \tag{31}$$

If $u_{1k}$ were zero, it would follow from (31), since $b_1 \neq 0$, that $u_{2k} = 0$. We can then
use the second component of (30) to deduce similarly that $u_{3k} = 0$; continuing in
this fashion, we deduce that all components of $u_k$ are zero, a contradiction.

(ii) Suppose on the contrary that $d_k$ is a multiple eigenvalue; then its eigenspace
has dimension greater than 1. In a space of dimension greater than one, we can
always find a vector whose first component is zero; but this contradicts part (i) of
Lemma 5. □

Lemma 6. The eigenvalues $d_1, \ldots, d_n$ and the first components $u_{1k}$,
$k = 1, \ldots, n$, of the normalized eigenvectors of L uniquely determine all entries
$a_1, \ldots, a_n$ and $b_1, \ldots, b_{n-1}$ of L.

Proof. From the spectral representation (29), we can express the entry $L_{11} = a_1$
of L as follows:

$$a_1 = \sum_k d_k u_{1k}^2. \tag{32}_1$$

From equation (31) we get

$$b_1 u_{2k} = (d_k - a_1)\, u_{1k}. \tag{33}_1$$

Squaring both sides and summing with respect to k gives

$$b_1^2 = \sum_k (d_k - a_1)^2 u_{1k}^2; \tag{34}_1$$

here we have used the fact that the matrix U is orthogonal, and therefore

$$\sum_k u_{2k}^2 = 1.$$

We have shown in Lemma 4 that $b_1(t)$ doesn't change sign; therefore $b_1$ is
determined by $(34)_1$. We now set this determination of $b_1$ into $(33)_1$ to obtain the
values of $u_{2k}$.

Next we use the spectral representation (29) again to express $a_2 = L_{22}$:

$$a_2 = \sum_k d_k u_{2k}^2. \tag{32}_2$$

We proceed as before to the second equation in (30), which we write as

$$b_2 u_{3k} = -b_1 u_{1k} + (d_k - a_2)\, u_{2k}. \tag{33}_2$$

Squaring and summing over k gives

$$b_2^2 = \sum_k \left(-b_1 u_{1k} + (d_k - a_2)\, u_{2k}\right)^2, \tag{34}_2$$

and so on. □
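The recursion in the proof of Lemma 6 can be sketched in code. The following is a minimal illustration under stated assumptions: every $b_k$ is taken to be positive (Lemma 4 only says the sign is fixed, so the sign has to be supplied), and the function name `reconstruct_tridiagonal` is hypothetical. The test data are the eigenvalues and first eigenvector components of the 3 x 3 matrix with 2 on the diagonal and 1 next to it.

```python
import math

def reconstruct_tridiagonal(d, u1):
    """Recover (a, b) from eigenvalues d and first eigenvector components u1.

    Follows the steps of Lemma 6: a_k from the spectral representation,
    then b_k and the next row of U from the k-th component of L u = d u.
    Assumption: all b_k are positive.
    """
    n = len(d)
    a, b = [], []
    prev = [0.0] * n      # row k-1 of U (zero before the first row)
    row = list(u1)        # row k of U, starting with the first row
    b_prev = 0.0
    for k in range(n):
        a_k = sum(dk * r * r for dk, r in zip(d, row))
        a.append(a_k)
        if k == n - 1:
            break
        # b_k * (next row) = (d - a_k) * row - b_{k-1} * prev
        w = [(dk - a_k) * r - b_prev * p for dk, r, p in zip(d, row, prev)]
        b_k = math.sqrt(sum(x * x for x in w))
        b.append(b_k)
        prev, row, b_prev = row, [x / b_k for x in w], b_k
    return a, b

# Spectral data of the matrix with rows (2,1,0), (1,2,1), (0,1,2).
root2 = math.sqrt(2.0)
d = [2.0 + root2, 2.0, 2.0 - root2]
u1 = [0.5, 1.0 / root2, 0.5]
a, b = reconstruct_tridiagonal(d, u1)
```

With these inputs the routine should return the original entries, all $a_k$ equal to 2 and all $b_k$ equal to 1, up to rounding.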

Jürgen Moser has determined the asymptotic behavior of L(t) as t tends
to $\infty$.


Theorem 7 (Moser). L(t) is a solution of equation (21). Denote the eigenvalues
of L by $d_1, \ldots, d_n$, arranged in decreasing order, and denote by D the diagonal matrix
with diagonal entries $d_1, \ldots, d_n$. Then

$$\lim_{t \to \infty} L(t) = D. \tag{34}$$

Similarly,

$$\lim_{t \to -\infty} L(t) = D_-, \tag{34}_-$$

where $D_-$ is the diagonal matrix whose diagonal entries are $d_n, \ldots, d_1$.

Proof. We start with the following lemma.

Lemma 8. Denote by u(t) the row vector consisting of the first components of
the normalized eigenvectors of L(t):

$$u(t) = (u_{11}, \ldots, u_{1n}). \tag{35}$$

Claim:

$$u(t) = \frac{u(0)\, e^{Dt}}{\| u(0)\, e^{Dt} \|}. \tag{36}$$


Proof. We have shown that when L(t) satisfies (21), L(t) and L(0) are related by
(28). Multiplying this relation by V(t) on the left gives

$$L(t)\,V(t) = V(t)\,L(0). \tag{28}'$$

Denote as before the normalized eigenvectors of L(t) by $u_k(t)$. Let $(28)'$ act on $u_k(0)$.
Since

$$L(0)\, u_k(0) = d_k u_k(0),$$

we get

$$L(t)\,V(t)\,u_k(0) = d_k V(t)\, u_k(0).$$

This shows that $V(t)u_k(0) = u_k(t)$ are the normalized eigenvectors of L(t).

V(t) satisfies the differential equation (25), $\frac{d}{dt} V = BV$. Therefore $u_k(t) = V(t)u_k(0)$ satisfies

$$\frac{d}{dt} u_k = B u_k. \tag{37}$$

Since B is of form (23), the first component of (37) is

$$\frac{d}{dt} u_{1k} = b_1 u_{2k}. \tag{37}'$$

We now use equation $(33)_1$ to rewrite the right-hand side:

$$\frac{d}{dt} u_{1k} = (d_k - a_1)\, u_{1k}. \tag{37}''$$

Define f(t) by

$$f(t) = \int_0^t a_1(s)\, ds.$$

Equation $(37)''$ can be rewritten as

$$\frac{d}{dt}\left[e^{f(t) - d_k t}\, u_{1k}(t)\right] = 0,$$

from which we deduce that

$$e^{f(t) - d_k t}\, u_{1k}(t) = c_k,$$

where $c_k$ is a constant. So

$$u_{1k}(t) = c_k\, e^{d_k t} / F(t),$$

where $F(t) = \exp[f(t)]$. Since $f(0) = 0$, $F(0) = 1$, and $c_k = u_{1k}(0)$; we obtain

$$u_{1k}(t) = u_{1k}(0)\, e^{d_k t} / F(t). \tag{38}$$

In vector notation, (38) becomes

$$u(t) = u(0)\, e^{Dt} / F(t).$$

Since u(t) is the first row of an orthogonal matrix, it has norm 1. This shows that F(t)
is the normalizing factor, and it proves formula (36). □
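Formula (36) can be tested against the differential equation without integrating anything. The sketch below is an illustration with made-up data (the eigenvalues `d` and the unit vector `u0` are arbitrary choices, and all function names are hypothetical): it builds $u(t)$ from (36), forms $a_1(t)$ from $(32)_1$ and $b_1^2(t)$ from $(34)_1$, and compares a central-difference estimate of $\frac{d}{dt} a_1$ with $2 b_1^2$, which is what (24) requires for $k = 1$.

```python
import math

def first_row(d, u0, t):
    """Formula (36): u(t) = u(0) e^{Dt} / ||u(0) e^{Dt}||."""
    shift = max(dk * t for dk in d)          # common factor, avoids overflow
    w = [uk * math.exp(dk * t - shift) for dk, uk in zip(d, u0)]
    norm = math.sqrt(sum(x * x for x in w))
    return [x / norm for x in w]

def a1(d, u0, t):
    """a_1(t) as the weighted sum of the eigenvalues with weights u_{1k}(t)^2."""
    return sum(dk * uk * uk for dk, uk in zip(d, first_row(d, u0, t)))

def b1_squared(d, u0, t):
    """b_1(t)^2 as the weighted sum of (d_k - a_1)^2 with the same weights."""
    a = a1(d, u0, t)
    return sum((dk - a) ** 2 * uk * uk
               for dk, uk in zip(d, first_row(d, u0, t)))

d = [3.0, 1.0, -2.0]            # eigenvalues in decreasing order (assumed data)
u0 = [1 / 3, 2 / 3, 2 / 3]      # first components at t = 0, a unit vector
t, h = 0.4, 1e-5
slope = (a1(d, u0, t + h) - a1(d, u0, t - h)) / (2 * h)   # estimate of a_1'(t)
target = 2.0 * b1_squared(d, u0, t)                       # right side of (24)
a1_late = a1(d, u0, 20.0)                                 # should be near d_1
b1sq_late = b1_squared(d, u0, 20.0)                       # should be near 0
```

The same functions also show the limiting behavior claimed in Theorem 7 for the first row: for large t, `a1` approaches the largest eigenvalue and `b1_squared` approaches zero.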

(i) Since the eigenvalues of L are distinct (see Lemma 5), formula (36) shows
that as $t \to \infty$, the first component $u_{11}(t)$ of u(t) is exponentially larger than the other
components. Since the vector u(t) has norm 1, it follows that as $t \to \infty$, $u_{11}(t)$ tends
to 1, and $u_{1k}(t)$, $k > 1$, tend to zero at an exponential rate.

(ii) Next we take equation $(32)_1$:

$$a_1(t) = \sum_k d_k u_{1k}^2(t).$$

From what we have shown about $u_{1k}(t)$, it follows that $a_1(t)$ tends to $d_1$ at an
exponential rate as $t \to \infty$.

(iii) To estimate $b_1$ we take the representation $(34)_1$:

$$b_1^2(t) = \sum_k (d_k - a_1(t))^2 u_{1k}^2(t).$$

From $a_1(t) \to d_1$ and the fact that u(t) is a unit vector, we deduce that $b_1(t)$ tends to
zero at an exponential rate as $t \to \infty$.

(iv) The first two rows of U are orthogonal:

$$\sum_k u_{1k} u_{2k} = 0. \tag{39}$$

According to (i), $u_{11}(t) \to 1$ and $u_{1k}(t) \to 0$, $k > 1$, exponentially as $t \to \infty$. It follows
therefore from (39) that $u_{21}(t) \to 0$ exponentially as $t \to \infty$.

(v) From (31) we deduce that

$$\frac{u_{2k}}{u_{22}} = \frac{(d_k - a_1)\, u_{1k}}{(d_2 - a_1)\, u_{12}}. \tag{40}$$

By the explicit formula (38) we can write this as

$$\frac{u_{2k}(t)}{u_{22}(t)} = \frac{(d_k - a_1(t))\, u_{1k}(0)}{(d_2 - a_1(t))\, u_{12}(0)}\, e^{(d_k - d_2)t}. \tag{41}$$

Take $k > 2$; then the right-hand side of (41) tends to zero as $t \to \infty$, and therefore
$u_{2k}(t) \to 0$ as $t \to \infty$ for $k > 2$. We have shown in (iv) that $u_{21}(t) \to 0$ as $t \to \infty$.
Since $(u_{21}, \ldots, u_{2n})$ is a unit vector, it follows that $u_{22}(t) \to 1$ exponentially.

(vi) According to formula $(32)_2$,

$$a_2(t) = \sum_k d_k u_{2k}^2(t).$$

Since we have shown in (v) that $u_{2k}(t) \to 0$ for $k \neq 2$ and that $u_{22} \to 1$, it follows
that $a_2(t) \to d_2$.

(vii) Formula $(34)_2$ represents $b_2^2(t)$ as a sum. We have shown above that all
terms of this sum tend to zero as $t \to \infty$. It follows that $b_2(t) \to 0$ as
$t \to \infty$, at the usual exponential rate.

The limiting behavior of the rest of the entries can be argued similarly; Deift et al.
supply all the details.

Identical arguments show that L(t) tends to $D_-$ as $t \to -\infty$.

Moser's proof of Theorem 7 runs along different lines.

We conclude this chapter with four observations.

Note 1. It may surprise the reader that in Lemma 8 we present an explicit solution.
The explanation is that the Toda lattice, of which (21) is a form, is completely
integrable. According to Liouville's Theorem, such systems have explicit solutions.

Note 2. Moser's Theorem is a continuous analogue of the convergence of the QR
algorithm to D when applied to a tridiagonal matrix.

Note 3. Deift et al. point out that (21) is only one of a whole class of flows of
tridiagonal symmetric matrices that tend to D as $t \to \infty$. These flows are in
commutator form (21), where the matrix B is taken as

$$B = p(L)_+ - p(L)_-,$$

where p is a polynomial, $M_+$ denotes the upper triangular part of M, and $M_-$ denotes
its lower triangular part. The choice (23) for B corresponds to the choice $p(L) = L$.

Note 4. Deift et al. point out that solving numerically the matrix differential
equation (21) until such time when $b_1, \ldots, b_{n-1}$ become less than a preassigned small
number is a valid numerical method for finding approximately the eigenvalues of L.
In Section 4 of their paper they present numerical examples comparing the speed of
this method with the speed of the QR algorithm.
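The numerical method of Note 4 can be sketched as follows. This is a minimal illustration, not the scheme used by Deift et al.: it integrates the entrywise equations (24) with a classical fourth-order Runge-Kutta step (a choice made here for simplicity), and the step size, tolerance, and function names are all assumptions.

```python
def toda_rhs(a, b):
    """Right-hand side of (24), with the convention b_0 = b_n = 0."""
    n = len(a)
    pad = [0.0] + list(b) + [0.0]
    da = [2.0 * (pad[k + 1] ** 2 - pad[k] ** 2) for k in range(n)]
    db = [b[k] * (a[k + 1] - a[k]) for k in range(n - 1)]
    return da, db

def rk4_step(a, b, h):
    """One classical Runge-Kutta step of size h for the system (24)."""
    def shifted(da, db, s):
        return ([x + s * y for x, y in zip(a, da)],
                [x + s * y for x, y in zip(b, db)])
    k1 = toda_rhs(a, b)
    k2 = toda_rhs(*shifted(*k1, h / 2))
    k3 = toda_rhs(*shifted(*k2, h / 2))
    k4 = toda_rhs(*shifted(*k3, h))
    new_a = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(a, k1[0], k2[0], k3[0], k4[0])]
    new_b = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(b, k1[1], k2[1], k3[1], k4[1])]
    return new_a, new_b

def toda_eigenvalues(a, b, h=0.01, tol=1e-10, max_steps=200000):
    """Follow the flow until every off-diagonal entry is below tol."""
    a, b = list(a), list(b)
    for _ in range(max_steps):
        if max(abs(x) for x in b) < tol:
            break
        a, b = rk4_step(a, b, h)
    return a

# Matrix with rows (2,1,0), (1,2,1), (0,1,2); its eigenvalues are 2 + sqrt(2), 2, 2 - sqrt(2).
approx = toda_eigenvalues([2.0, 2.0, 2.0], [1.0, 1.0])
```

In line with Theorem 7, the diagonal that comes out is sorted in decreasing order. The accuracy of the eigenvalues is limited by the integration error, not by `tol`.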
Solutions of Selected Exercises


CHAPTER 1 

Ex 1. Suppose z is another zero:

$$x + z = x \quad \text{for all } x.$$

Set $x = 0$: $0 + z = 0$. But also $z + 0 = z$, so $z = 0$.

Ex 3. Every polynomial p of degree $< n$ can be written as

$$p = a_1 + a_2 x + \cdots + a_n x^{n-1}.$$

Then $p \leftrightarrow (a_1, \ldots, a_n)$ is an isomorphism.

Ex 7. If $x_1$, $x_2$ belong to X and to Y, then $x_1 + x_2$ belongs to X and Y.

Ex 10. If $x_i$ were 0, then

$$1 \cdot x_i + \sum_{j \neq i} 0 \cdot x_j = 0.$$

Ex 13. (iii) If $x_1 - x_2$ is in Y and $x_2 - x_3$ is in Y, then so is their sum

$$x_1 - x_2 + x_2 - x_3 = x_1 - x_3.$$

Ex 14. Suppose $\{x_1\}$ and $\{x_2\}$ have a vector $x_3$ in common. Then $x_3 \equiv x_1$ and
$x_3 \equiv x_2$; but then $x_1 \equiv x_2$, so $\{x_1\} = \{x_2\}$.


Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright © 2007 John Wiley & Sons, Inc.
Ex 16. (i) Polynomials of degree $< n$ that are zero at $t_1, \ldots, t_j$ can be written in the
form

$$q(t) \prod_{i=1}^{j} (t - t_i),$$

where q is a polynomial of degree $< n - j$. These clearly form a linear space, whose
dimension is $n - j$.

By Theorem 6,

$$\dim X/Y = \dim X - \dim Y = n - (n - j) = j.$$

The quotient space X/Y can be identified with the space of vectors

$$(p(t_1), \ldots, p(t_j)).$$

Ex 19. Use Theorem 6 and Exercise 18.

Ex 20. (b) and (d) are subspaces.

Ex 21. The statement is false; here is an example to the contrary:

$$X = \mathbb{R}^2 = (x, y) \text{ space},$$
$$U = \{y = 0\}, \quad V = \{x = 0\}, \quad W = \{x = y\}.$$

$$U \cap V = \{0\}, \quad U \cap W = \{0\},$$

$$V \cap W = \{0\}, \quad U \cap V \cap W = \{0\}.$$

So

$$2 \neq 1 + 1 + 1 - 0 - 0 - 0 + 0.$$


CHAPTER 2 


Ex 4 Wc choose m\ — 
p{t) = f 2 ? (9) says that 


m 3 ； then (9) is satisfied for p(t) = For p(t) — I and 


2 = 2m \ + 川 2 , 



So 



and m 2 = 



from which (ii) follows, (iiij (9) holds for all odd polynomials like r; and 1 ' For 
/ >(/) ^ f 4 * (9) says that 



2 州 1 a 4 



which holds for a : 
Ex 5. Take $m_1 = m_3$, in order to satisfy (9) for all odd polynomials. For
$p(t) = 1$ and $p(t) = t^2$ we get two equations easily solved.

Ex 6. (a) Suppose there is a linear relation

$$a\, l_1(p) + b\, l_2(p) + c\, l_3(p) = 0.$$

Set $p(x) = (x - \xi_2)(x - \xi_3)$. Then $p(\xi_2) = p(\xi_3) = 0$, $p(\xi_1) \neq 0$; so we get
from the above relation that $a = 0$. Similarly $b = 0$, $c = 0$.

(b) Since $\dim P_2 = 3$, also $\dim P_2' = 3$. Since $l_1$, $l_2$, $l_3$ are linearly independent,
they span $P_2'$.

(c) Set

$$p_1(x) = \frac{(x - \xi_2)(x - \xi_3)}{(\xi_1 - \xi_2)(\xi_1 - \xi_3)},$$

and define $p_2$, $p_3$ analogously. Clearly

$$l_i(p_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$


Ex 7. $l(x)$ has to be zero for $x = (1, 0, -1, 2)$ and $x = (2, 3, 1, 1)$. These yield two
equations for $c_1, \ldots, c_4$:

$$c_1 - c_3 + 2c_4 = 0, \qquad 2c_1 + 3c_2 + c_3 + c_4 = 0.$$

We express $c_1$ and $c_2$ in terms of $c_3$ and $c_4$. From the first equation, $c_1 = c_3 - 2c_4$.
Setting this into the second equation gives $c_2 = -c_3 + c_4$.


CHAPTER 3 

Ex 1. If $Ty_1 = u_1$, $Ty_2 = u_2$, then $T(y_1 + y_2) = u_1 + u_2$, and conversely.

Ex 2. Suppose we drop the ith equation; if the remaining equations do not
determine x uniquely, there is an x that is mapped into a vector whose components
except the ith are zero. If this were true for all $i = 1, \ldots, m$, the range of the mapping
$x \to u$ would be m-dimensional; but according to Theorem 2, the dimension of the
range is $\leq n < m$. Therefore one of the equations may be dropped without losing
uniqueness; by induction $m - n$ of the equations may be omitted.

Ex 4. Rotation maps the parallelogram $0, x, y, x + y$ into another parallelogram
$0, x', y', z'$; therefore $z' = x' + y'$.

ST maps (1, 0, 0) into (0, 1, 0); TS maps (1, 0, 0) into (0, 0, 1).
Ex 5. Set $Tx = u$; then $(T^{-1}T)x = T^{-1}u = x$, and $(TT^{-1})u = Tx = u$.

Ex 6. Part (ii) is true for all mappings, linear or nonlinear. Part (iii) was illustrated
by Eisenstein as follows: the inverse of putting on your shirt and then your jacket is
taking off your jacket and then your shirt.

Ex 7. $(l, STx) = ((ST)'l, x)$;

also

$$(l, STx) = (S'l, Tx) = (T'S'l, x),$$

from which $(ST)' = T'S'$ follows.

Ex 8. $(Tl, x) = (l, T'x) = (T''l, x)$ for all x; therefore $Tl = T''l$.

Ex 10. If $M = SKS^{-1}$, then $S^{-1}MS = K$, and by Theorem 4,

$$S^{-1}M^{-1}S = K^{-1}.$$

Ex 11. $AB = ABAA^{-1} = A(BA)A^{-1}$, by repeated use of the associative law.

Ex 13. The even part of an even function is the function itself.

CHAPTER 4 

Ex 1. $(DA)_{ij} = \sum_k D_{ik}A_{kj} = d_i A_{ij}$, $\quad (AD)_{ij} = \sum_k A_{ik}D_{kj} = A_{ij}d_j$.

Ex 2. In most texts the proof is obscure.

Ex 4. Choose B so that its range is the nullspace of A, but the range of A is not the
nullspace of B.

CHAPTER 5 

Ex 1. $P(p_1 \circ p_2(x)) = \sigma(p_1 \circ p_2)P(x)$. Since $p_1 \circ p_2(x) = p_1(p_2(x))$,

$$P(p_1 \circ p_2(x)) = P(p_1(p_2(x))) = \sigma(p_1)P(p_2(x));$$

also

$$P(p_2(x)) = \sigma(p_2)P(x).$$

Combining these identities yields $\sigma(p_1 \circ p_2) = \sigma(p_1)\sigma(p_2)$.





Ex 2. (c) The signature of the transposition of two adjacent variables $x_k$ and $x_{k+1}$
is $-1$. The transposition of any two variables can be obtained by composing
an odd number of interchanges of adjacent variables. The result follows from
Exercise 1.

(d) To factor p — 乂 ;; as a product p = 4 ， f| of transpositions, set 

I 2- • ， p! … n 

h — ~~^~t 

P\ 2 … 丨 .■屮 

1 2- - p2 … n 

h — 1 ^ i 

1 p 2 … 2，，，" 

iind so on 】 

Ex 3* Follows from (7) 力 . 

Ex 4. (iii) When $a_1 = e_1, \ldots, a_n = e_n$, the only nonzero term on the right side in
(16) is p = identity.

(iv) When $a_i$ and $a_j$ are interchanged, the right side of (16) can be written as

where t is the transposition of i and j. The result follows from $\sigma(t \circ p) =
\sigma(t)\sigma(p) = -\sigma(p)$.

Ex 5. Suppose two columns of A are equal. Then, by (iv),

$$D(a, a) = -D(a, a),$$

so $2D(a, a) = 0$.

CHAPTER 6 

Ex 2. (a) All terms in $(14)'$ tend to zero.

(b) Each component of $A^N h$ is a sum of exponential functions of N, with distinct
positive exponents.

Ex 5, (25) is a special case of (26)，with q(a) = - The general case follows by 

combining relations (25) for various values of M 

Ex 7. For x in $N_d$,

$$(A - aI)^d Ax = A(A - aI)^d x = 0.$$

Ex 8. Let p(s) be a polynomial of degree less than $\sum d_j$. Then some $a_j$ is a root of p
of order less than $d_j$. But then p(A) does not map all of $N_{d_j}$ into 0.

Ex 12. h 二 （1.2)， 

{/ l? *i) = 3, (luh 2 ) = 0, 

(/i^O-O, (IjJn) - 3, 

CHAPTER 7 


Ex 1. According to the Schwarz inequality, $(x, y) \leq \|x\|$ for all unit vectors y. For
$y = x/\|x\|$ equality holds.

Ex 2. Let Y denote any subspace of X, x and z any pair of vectors in X. Decompose
them as

$$x = y + y^\perp, \qquad z = u + u^\perp,$$

where y and u are in Y, $y^\perp$ and $u^\perp$ orthogonal to Y; then

$$Px = y, \qquad Pz = u,$$

P orthogonal projection into Y.

$$(Px, z) = (y, u + u^\perp) = (y, u);$$

$$(x, Pz) = (y + y^\perp, u) = (y, u).$$

This proves that P is its own adjoint.

Ex 3. Reflection across the plane $x_3 = 0$ maps $(x_1, x_2, x_3)$ into $(x_1, x_2, -x_3)$. The
matrix representing this mapping is

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},$$

whose determinant is $-1$.

Ex 5. If the rows of M are pairwise orthogonal unit vectors, then according to the
rules of matrix multiplication, $MM^T = I$. Since a right inverse is a left inverse as
well, $M^T M = I$; from this it follows from the rules of matrix multiplication that the
columns of M are pairwise orthogonal unit vectors.
Ex 6. $a_{ij} = (Ae_j, e_i)$. By the Schwarz inequality

$$|a_{ij}| \leq \|Ae_j\|\, \|e_i\|.$$

From the definition of norm

$$\|Ae_j\| \leq \|A\|\, \|e_j\|.$$

Since $\|e_i\| = \|e_j\| = 1$, we deduce that

$$|a_{ij}| \leq \|A\|.$$

Ex 7. Let $x_1, \ldots, x_n$ be an orthonormal basis for X. Then any x in X can be
expressed as

$$x = \sum_j a_j x_j,$$

and

$$\|x\|^2 = \sum_j |a_j|^2.$$

We can write Ax as

$$Ax = \sum_j a_j A x_j,$$

so

$$\|Ax\| \leq \sum_j |a_j|\, \|Ax_j\|.$$

Using the classical Schwarz inequality yields

$$\|Ax\|^2 \leq \sum_j |a_j|^2 \sum_j \|Ax_j\|^2,$$

from which

$$\|A\|^2 \leq \sum_j \|Ax_j\|^2$$

follows.

Apply this inequality to $A_n - A$ in place of A to deduce that if $(A_n - A)x_j$
converges to zero for all $x_j$, so does $\|A_n - A\|$.

Ex 8. According to identity (44),

$$\|x + y\|^2 = \|x\|^2 + 2\,\mathrm{Re}(x, y) + \|y\|^2.$$

Replace y by ty, where $t = -\mathrm{Re}(x, y)/\|y\|^2$. Since the left side is nonnegative,
we get

$$|\mathrm{Re}(x, y)| \leq \|x\|\, \|y\|.$$

Replace x by kx, $|k| = 1$, and choose k so that the left side is maximized; we obtain

$$|(x, y)| \leq \|x\|\, \|y\|.$$


Ex 14. For any mapping A,

$$\det A^* = \overline{\det A}.$$

For M unitary, $M^*M = I$; by the multiplicative property of determinants,

$$\det M^* \det M = \det I = 1.$$

Using $\det M^* = \overline{\det M}$ we deduce

$$|\det M|^2 = 1.$$

Ex 17. $(AA^*)_{ii} = \sum_k a_{ik}\, \overline{a_{ik}} = \sum_k |a_{ik}|^2$;

$$\operatorname{tr} AA^* = \sum_i (AA^*)_{ii} = \sum_{i,k} |a_{ik}|^2.$$

Ex 19. For $A = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}$, $\operatorname{tr} A = 4$, $\det A = 3$, so the characteristic equation of A is

$$a^2 - 4a + 3 = 0.$$

Its roots are the eigenvalues of A; the larger root is $a = 3$.

On the other hand, $\sum a_{ij}^2 = 1 + 4 + 9 = 14$; $\sqrt{14} \approx 3.74$, so by (46) and (51)

$$3 \leq \|A\| \leq 3.74.$$

For the value of $\|A\| \approx 3.65$, see Ex. 12 in Chapter 8.

Ex 20. (i) Since $\det(x, y, z)$ is a multilinear function of x and y when the other
variables are held fixed, $w(x, y)$ is a bilinear function of x and y.

(ii) follows from $\det(y, x, z) = -\det(x, y, z)$.

(iii) is true because $\det(x, y, x) = 0$ and $\det(x, y, y) = 0$.

(iv) Multiply the matrix $(x, y, z)$ by R: $R(x, y, z) = (Rx, Ry, Rz)$.
By the multiplicative property of determinants, and since $\det R = 1$,

$$\det(x, y, z) = \det(Rx, Ry, Rz);$$

therefore

$$(w(x, y), z) = (w(Rx, Ry), Rz) = (R^* w(Rx, Ry), z),$$

from which

$$w(x, y) = R^* w(Rx, Ry)$$

follows. Multiply both sides by R.

(v) Take $x_0 = a(1, 0, 0)'$, $y_0 = b(\cos\theta, \sin\theta, 0)'$. Then

$$(x_0 \times y_0, z) = \det \begin{pmatrix} a & b\cos\theta & z_1 \\ 0 & b\sin\theta & z_2 \\ 0 & 0 & z_3 \end{pmatrix} = (ab \sin\theta)\, z_3.$$

Therefore

$$x_0 \times y_0 = ab \sin\theta\, (0, 0, 1)'.$$

Since $a = \|x_0\|$, $b = \|y_0\|$,

$$\|x_0 \times y_0\| = \|x_0\|\, \|y_0\| \sin\theta.$$

Any pair of vectors that make an angle $\theta$ can be rotated into $x_0$, $y_0$; using (iv)
we deduce

$$\|x \times y\| = \|x\|\, \|y\| \sin\theta.$$

CHAPTER 8 

Ex 1. $(x, Mx) = (M^*x, x) = \overline{(x, M^*x)}$;

$$\mathrm{Re}\,(x, Mx) = \tfrac{1}{2}(x, Mx) + \tfrac{1}{2}(x, M^*x) = \left(x, \frac{M + M^*}{2}\, x\right).$$



Ex 4. Multiply $(24)'$ by M on the right; since $M^T M = I$,

$$HM = MD.$$

The jth column of the left side is $Hm_j$, where $m_j$ is the jth column of M. The jth
column of the right side is $d_j m_j$; therefore

$$Hm_j = d_j m_j.$$


Ex 8. Let a be an eigenvalue of $M^{-1}H$, u an eigenvector:

$$M^{-1}Hu = au.$$

Multiply both sides on the left by M and take the inner product with u:

$$(Hu, u) = a\,(Mu, u).$$

Since M is positive,

$$a = \frac{(Hu, u)}{(Mu, u)}.$$

This proves that a is real.

Ex 10. A normal matrix N has a full set of orthonormal eigenvectors $f_1, \ldots, f_n$:

$$Nf_j = n_j f_j.$$

Any vector x can be expressed as

$$x = \sum_j a_j f_j, \qquad \|x\|^2 = \sum_j |a_j|^2,$$

while

$$Nx = \sum_j a_j n_j f_j, \qquad \|Nx\|^2 = \sum_j |a_j|^2 |n_j|^2;$$

so

$$\|Nx\| \leq \max_j |n_j|\, \|x\|,$$

with equality holding for $x = f_m$, $|n_m| = \max |n_j|$. This proves that

$$\|N\| = \max_j |n_j|.$$
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Ex 1L (b) An eigenvector of S，with eigenvalue v, satisfies 


fj—i ~ ^fjfj ~ 2 , ■ • ” 人 = vfi ， 


fi= v n ~ l f, = 


Therefore v is an /ith root of unity; 


exp^Jt, k 


Their scalar product, 


ifkJi) 


= !>p ( 字 心 ) ex p(v 


for k^i 


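Ex 12. (i)

$$A^*A = \begin{pmatrix} 1 & 2 \\ 2 & 13 \end{pmatrix}.$$

The characteristic equation of $A^*A$ is

$$\lambda^2 - 14\lambda + 9 = 0.$$

The larger root is

$$\lambda_{\max} = 7 + \sqrt{40} \approx 13.325.$$

By Theorem 13,

$$\|A\| = \sqrt{\lambda_{\max}} \approx 3.65.$$

(ii) This is consistent with the estimate obtained in Ex. 19 of Chapter 7:

$$3 \leq \|A\| \leq 3.74.$$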
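The arithmetic in Ex 12 is easy to reproduce. The sketch below is an illustration only: it assumes, as in Ex. 19 of Chapter 7, that A is the matrix with rows (1, 2) and (0, 3), and the function name `spectral_norm_2x2` is hypothetical. It computes the norm as the square root of the larger eigenvalue of $A^T A$, using the quadratic formula on the characteristic equation.

```python
import math

def spectral_norm_2x2(A):
    """||A|| = sqrt(largest eigenvalue of A^T A) for a real 2x2 matrix."""
    (p, q), (r, s) = A
    # Entries of the symmetric matrix A^T A.
    g11, g12, g22 = p * p + r * r, p * q + r * s, q * q + s * s
    trace, det = g11 + g22, g11 * g22 - g12 * g12
    # Larger root of lambda^2 - trace * lambda + det = 0.
    lam_max = (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0
    return math.sqrt(lam_max)

A = [[1.0, 2.0], [0.0, 3.0]]
norm_A = spectral_norm_2x2(A)
closed_form = math.sqrt(7.0 + math.sqrt(40.0))   # value derived in Ex 12
upper_bound = math.sqrt(14.0)                    # bound from Ex 19 of Chapter 7
```

Here `norm_A` agrees with `closed_form` and lies between the two bounds 3 and $\sqrt{14}$ quoted in part (ii).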


Ex 13. 


1 0 


I 0 


2 2 
2 13 
The characteristic equation of the matrix on the right is

$$\lambda^2 - 15\lambda + 22 = 0,$$

whose larger root is

$$\lambda_{\max} = \frac{15 + \sqrt{137}}{2} \approx 13.35.$$

By Theorem 13, the norm of the matrix is

$$\sqrt{\lambda_{\max}} \approx 3.65.$$

CHAPTER 9 


Ex 2. Differentiate $A^{-1}A = I$ using the product rule:

$$\left(\frac{d}{dt} A^{-1}\right) A + A^{-1} \frac{d}{dt} A = 0.$$

Solve for $\frac{d}{dt} A^{-1}$; (3) results.


Ex 3* Denote f ^ ^ 1 m C;C = I, so C JJ = C for n odd ， = I for n even. 


otpC = C( 1 … J + 


2 2 


• 17 1 


:; I) 


Ex 6. For $Y(t) = \exp At$,

$$\frac{d}{dt} Y(t) = (\exp At)\,A, \qquad Y^{-1} \frac{dY}{dt} = A.$$

By formula (10)

$$\frac{d}{dt} \log \det \exp At = \operatorname{tr} A,$$

so

$$\log \det \exp At = t \operatorname{tr} A.$$

Thus

$$\det \exp At = \exp(t \operatorname{tr} A).$$


Ex 7. According to Theorem 4 in Chapter 6, for any polynomial p, the
eigenvalues of p(A) are of the form p(a), a an eigenvalue of A. To extend this
from polynomials to the exponential function we note that $e^s$ is defined as the
limit of polynomials $e_m(s)$ that are defined by formula (12). To complete the
proof we apply Theorem 6.

In Ex. 6 we have shown that $\det \exp A = \exp(\operatorname{tr} A)$; this indicates that the
multiplicity of $e^a$ as an eigenvalue of $e^A$ is the same as the multiplicity of a as an
eigenvalue of A.


CHAP1K.R 10 

Ex 1. In formula (6) for $\sqrt{H}$ we may take $\sqrt{h_j}$ to be either the positive or negative
square root. This shows that if H has n distinct, nonzero eigenvalues, H has $2^n$ square
roots. If one of the nonzero eigenvalues of H has multiplicity greater than one, H has
infinitely many square roots.

A - {\ is posiLivt :： it map^U,U) iruo(J. 2). B Ss po^ilivei it nuipi 

(L0) inlo (l s -2). The vectors (} f 2) und (I, -2) mukc an An^k > jr/2 t 奶 AH + liA 
is nol ptBiitivc. rndccd + 


has one negalive eigenvalue. 

fcx 4. ⑷ Apply Therein 5 twice. 

(b) Apply Theorem 5 it rimes, 

where 2 k = hk 

(c) The limit 

- I] < m[N Wa - l] 


gives 


log M 5 lo E N- 
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Note, (b) remains true for all positive exponents > L 

Ex 5. Choose A and B as in Exercise 3, that is, positive matrices whose
symmetrized product is not positive. Set

$$M = A, \qquad N = A + tB,$$

t a sufficiently small positive number. Clearly, $M < N$.

$$N^2 = A^2 + t(AB + BA) + t^2B^2;$$

for t small the term $t^2B^2$ is negligible compared with the linear term. Therefore for t
small $N^2$ is not greater than $M^2$.


Ex 6. We claim that the functions $f(s) = -(s + t)^{-1}$, t positive, are monotone
matrix functions. For if $0 < M < N$, then $0 < M + tI < N + tI$, and so by
Theorem 2,

$$(M + tI)^{-1} > (N + tI)^{-1}.$$

The function f(s) defined by (19) is the limit of linear combinations with positive
coefficients of functions of the form s and $-(s + t)^{-1}$, $t > 0$. Linear combinations
of monotone matrix functions are monotone, and so are their limits.

Note 1. Loewner also proved the converse of the theorem stated: Every monotone
matrix function is of form (19).

Note 2. Every function f(s) of form (19) can be extended to an analytic function
into the complex upper half plane $\mathrm{Im}\, s > 0$, so that the imaginary part of f(s) is
positive there, and zero on the positive real axis $s > 0$. According to a theorem of
Herglotz, all such functions f(s) can be represented in the form (19).

It is easy to verify that the functions $s^m$, $0 < m < 1$, and the function $\log s$ have
positive imaginary parts in the upper half plane.


Ex 7. The matrix 


is a Gram matrix: 



I 

n -^n+ I f 


^ > 0 , 



Mt)fj(t)dr m = A 


Ex 10. By the Schwarz inequality, and the definition of the norm of $M - N$,

$$(u, (M - N)u) \leq \|u\|\, \|(M - N)u\| \leq \|u\|^2\, \|M - N\| = d\,\|u\|^2.$$

Therefore

$$(u, Mu) \leq (u, Nu) + d\,\|u\|^2 = (u, (N + dI)u).$$

This proves that $M \leq N + dI$; the other inequality follows by interchanging the roles
of M and N.

Ex 11. Arrange the $m_i$ in increasing order:

$$m_1 \leq \cdots \leq m_n.$$

Suppose the $n_i$ are not in increasing order, that is, that for a pair of indices $i < j$,
$n_i > n_j$. We claim that interchanging $n_i$ and $n_j$ increases the sum (51):

$$n_i m_i + n_j m_j \leq n_j m_i + n_i m_j.$$

For rewrite this inequality as

$$(n_i - n_j)m_i + (n_j - n_i)m_j = (n_i - n_j)(m_i - m_j) \leq 0,$$

which is manifestly true. A finite number of interchanges shows that (51) is
maximized when the $n_i$ and $m_i$ are arranged in the same order.
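The conclusion of Ex 11 can be confirmed by brute force on a small example. The sketch below is illustrative only; the data and the name `paired_sum` are made up. It tries every ordering of the $n_i$ against a fixed increasing list of $m_i$ and compares the best ordering with the sorted one.

```python
from itertools import permutations

def paired_sum(n_vals, m_vals):
    """The sum of n_i * m_i for one particular ordering of the n_i."""
    return sum(n * m for n, m in zip(n_vals, m_vals))

m_vals = [1, 3, 4, 7]    # already arranged in increasing order
n_vals = [5, 2, 9, 6]    # arbitrary distinct values

# Exhaustive search over all orderings of the n_i.
best = max(permutations(n_vals), key=lambda perm: paired_sum(perm, m_vals))
sorted_sum = paired_sum(sorted(n_vals), m_vals)
```

Because all values in this example are distinct, the maximizing ordering is unique and coincides with the increasing arrangement of the $n_i$.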

Ex 12. If Z were not invertible, it would have zero as an eigenvalue, contradicting
Theorem 20.

Let h be any vector; denote $Z^{-1}h$ by k. Then

$$(Z^{-1}h, h) = (k, Zk).$$

Since the self-adjoint part of Z is positive, the real part of the right side above is positive. But then
so is the real part of the left side, which proves that the self-adjoint part of $Z^{-1}$ is positive.

Ex 13. When A is invertible, $AA^*$ and $A^*A$ are similar:

$$AA^* = A\,(A^*A)\,A^{-1},$$

and therefore have the same eigenvalues. Noninvertible A can be obtained as the
limit of a sequence of invertible matrices.

Ex 14. Let u be an eigenvector of $A^*A$, with nonzero eigenvalue:

$$A^*Au = ru, \qquad r \neq 0.$$

Denote Au as v; the vector v is nonzero, for if $Au = 0$, it follows from the above
relation that $u = 0$.

Let A act on the above relation:

$$AA^*Au = rAu,$$

which can be rewritten as

$$AA^*v = rv,$$

which shows that v is an eigenvector of $AA^*$, with eigenvalue r.

A maps the eigenspace of $A^*A$ with eigenvalue r into the eigenspace of $AA^*$; this
mapping is 1-to-1. Similarly $A^*$ maps the eigenspace of $AA^*$ into the eigenspace of
$A^*A$ in a 1-to-1 fashion. This proves that these eigenspaces have the same
dimension.

Ex IS. Take Z : ,a some real number; its eigenvalues are 1 and 2. But 

Z + Z* ^ 

is not positive when a > \/8* 

CHAPTER 11 

Ex 1. If $M_t = AM$, then $M^*_t = M^*A^* = -M^*A$.

Then

$$(M^*M)_t = M^*_t M + M^* M_t = -M^*AM + M^*AM = 0.$$

Since $M^*M = I$ at $t = 0$, $M^*M = I$ for all t. At $t = 0$, $\det M = 1$; therefore
$\det M = 1$ for all t. This proves that M(t) is a rotation.

Ex 5. The nonzero eigenvalues of a real antisymmetric matrix A are pure imaginary
and come in conjugate pairs $ik$ and $-ik$. The eigenvalues of $A^2$ are $0, -k^2, -k^2$, so
$\operatorname{tr} A^2 = -2k^2$. The diagonal entries of $A^2$ are $-(a^2 + b^2)$, $-(a^2 + c^2)$ and
$-(b^2 + c^2)$, so $\operatorname{tr} A^2 = -2(a^2 + b^2 + c^2)$. Therefore $k = \sqrt{a^2 + b^2 + c^2}$.

Ex 6. The eigenvalues of $e^{At}$ are $e^{at}$, where a are the eigenvalues of A. Since the
eigenvalues of A are $0, \pm ik$, the eigenvalues of $e^{At}$ are 1 and $e^{\pm ikt}$. From $Af = 0$ we
deduce that $e^{At}f = f$; thus f is the axis of the rotation $e^{At}$. The trace of $e^{At}$ is
$1 + e^{ikt} + e^{-ikt} = 2\cos kt + 1$. According to formula $(4)'$, the angle of rotation $\theta$ of
$M = e^{At}$ satisfies $2\cos\theta + 1 = \operatorname{tr} e^{At}$. This shows that $\theta = kt = \sqrt{a^2 + b^2 + c^2}\; t$.
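Both trace identities in Ex 5 and Ex 6 can be checked numerically. The sketch below is an illustration under stated assumptions: the antisymmetric matrix is taken with a, b, c above the diagonal in the arrangement that gives the diagonal entries of $A^2$ quoted in Ex 5, the exponential is approximated by a truncated power series, and the names `matmul` and `expm` are hypothetical helpers defined here.

```python
import math

def matmul(X, Y):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expm(A, t, terms=40):
    """Truncated power series for exp(At); adequate for small |t| * ||A||."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = matmul(term, A)
        term = [[x * t / m for x in row] for row in term]   # now (At)^m / m!
        result = [[r + x for r, x in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

a, b, c = 1.0, 2.0, 2.0
A = [[0.0, a, b], [-a, 0.0, c], [-b, -c, 0.0]]   # assumed layout of A
k = math.sqrt(a * a + b * b + c * c)
t = 0.3

A2 = matmul(A, A)
trace_A2 = A2[0][0] + A2[1][1] + A2[2][2]        # should equal -2 k^2
M = expm(A, t)
trace_M = M[0][0] + M[1][1] + M[2][2]            # should equal 1 + 2 cos(kt)
expected_trace_M = 1.0 + 2.0 * math.cos(k * t)
```

With these values $k = 3$, so the rotation angle after time $t = 0.3$ is $0.9$ radians.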


㈣ ， A= 二 

\-h 


their null vectors are 


a 

0 

-c 



d e\ 

0 ^ ； 

1 ()/ 



f ad 七 be bg 

AB = - ce ad + eg 
\ —cd bd 


-ag 

ae 

be + eg 


Therefore tr AB ; ^2{nd + be cg) % whereas the scalar product of f A and / B is 
eg + he + ad i 


Ex 9* BA can be calculated like AB, given above. Subtracting we get 

0 ec — bg —dc + ag 
一 BA = i bg — ec 0 Jb — tie 
dc — ag ae - dh 0 


Therefore 


/ [A.BJ = 



ae — dh 
a 只一 dc 
bg — ec 



We can verify that j\ k/r —by using the formula for the cross product in 
Chapter 7. 


CHAPTER 12 

Ex 2. (a) Let $\{K_i\}$ be a collection of convex sets, denote their intersection by K. If x
and y belong to K, they belong to every $K_i$. Since $K_i$ is convex, it contains the line
segment with endpoints x and y. Since this line segment belongs to all $K_i$, it belongs
to K. This shows that K is convex.

(b) Let x and y be two points in $H + K$; that means that they are of the form

$$x = u + z, \quad y = v + w, \qquad u \text{ and } v \text{ in } H, \; z \text{ and } w \text{ in } K.$$

Since H is convex, $au + (1 - a)v$ belongs to H, and since K is convex
$az + (1 - a)w$ belongs to K for $0 \leq a \leq 1$. But then their sum

$$au + (1 - a)v + az + (1 - a)w = a(u + z) + (1 - a)(v + w) = ax + (1 - a)y$$

belongs to $H + K$. This proves that $H + K$ is convex.

Ex 6. Denote $(u, v)$ as x. If both u and v are $\leq 0$, then $x/r$ belongs to K for any
positive r, no matter how small. So $p(x) = 0$.

If $0 < v$ and $u \leq v$, then $x/r$ belongs to K for $r > v$, but for no smaller r.
Therefore $p(x) = v$. We can argue similarly in the remaining case.

Ex 7. If $p(x) < 1$, $p(y) < 1$, and $0 \leq a \leq 1$, then by subadditivity and
homogeneity of p,

$$p(ax + (1 - a)y) \leq p(ax) + p((1 - a)y) = a\,p(x) + (1 - a)\,p(y) < 1.$$

This shows that the set of x with $p(x) < 1$ is convex.

To show that the set $p(x) < 1$ is open we argue as follows. By subadditivity and
positive homogeneity

$$p(x + ty) \leq p(x) + p(ty) = p(x) + t\,p(y).$$

Since $p(x) < 1$, $p(x) + t\,p(y) < 1$ for all t positive but small enough.

Ex 8.

$$q_S(m + l) = \sup_{x \in S}\, (m + l)(x) = \sup_{x \in S}\, (m(x) + l(x)) \leq \sup_{x \in S} m(x) + \sup_{x \in S} l(x) = q_S(m) + q_S(l).$$

Note. This is a special case of the result that the supremum of linear functions is
subadditive.

Ex 10.

$$q_{S \cup T}(l) = \sup_{x \in S \cup T} l(x) = \max\left\{\sup_{x \in S} l(x),\; \sup_{x \in T} l(x)\right\} = \max\{q_S(l),\, q_T(l)\}.$$

Ex 16, Suppose all /;； are positive. Define 

I k 

Y,Pj x ^JlPr 

i ] 


Yk 
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Then 


Ihen 

+ 士 h + 备 PfS = W 

Ex 20. A set S in Euclidean space is open if for every point x in S there is a ball
$\|y - x\| < \epsilon$ centered at x that belongs to S. Suppose S is convex and open; that
means that there exist positive numbers $\epsilon_i$ such that $x + te_i$ belongs to S for $|t| < \epsilon_i$;
here $e_i$ denotes the unit vectors. Denote $\min \epsilon_i$ by $\epsilon$; it follows that the points
$x \pm \epsilon e_i$, $i = 1, \ldots, n$ belong to S. Since S is convex, the convex hull of these points
belongs to S; this convex hull contains a ball of radius $\epsilon/\sqrt{n}$ centered at x.

The converse is obvious.


Similarly define 


Then 




^PJ 

j_ 

TTT 

Sp/ 

i 


•Vft . 

^PJ 

i 


We claim that alt points va belong to the convex set containing x “ …，; This 
follows inductively，since Vj — ,Y|, and va+i lie on the line segment whose endpoints 
arc Vi and Finally, 


)’m = Pi x j， 


Ex 19. Denote by Pv.Pi, the following 3x3 permutation matrices 
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CHAPTER 13 

Ex 3. We claim that the sign of equality holds in (21). For if not, S would be
larger than s, contrary to (20); this shows that the supremum in (16) is a
maximum.

Replacing Y, y, and j by $-Y$, $-y$, $-j$ turns the sup problem into inf, and vice versa.
This shows that the inf in (18) is a minimum.


CHAPTER 14 

Ex 2. $x - z = (x - y) + (y - z)$; apply the subadditive rule $(1)_{ii}$.

Ex 5. From the definition of the $|x|_p$ and $|x|_\infty$ norms we see that

$$|x|_\infty^p \leq |x|_p^p \leq n\,|x|_\infty^p.$$

Take the pth root:

$$|x|_\infty \leq |x|_p \leq n^{1/p}\, |x|_\infty.$$

Since $n^{1/p}$ tends to 1 as p tends to $\infty$, $|x|_\infty = \lim_{p \to \infty} |x|_p$ follows.
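The two-sided bound in Ex 5 can be watched numerically. The sketch below is illustrative only; the vector and the function name `p_norm` are made up, and the sum is scaled by the largest entry so that large exponents do not overflow.

```python
def p_norm(x, p):
    """|x|_p = (sum |x_j|^p)^(1/p), computed after scaling by the max entry."""
    m = max(abs(v) for v in x)
    if m == 0.0:
        return 0.0
    return m * sum((abs(v) / m) ** p for v in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0, 4.0]
sup_norm = max(abs(v) for v in x)                        # the max norm of x
norms = {p: p_norm(x, p) for p in (1, 2, 10, 100, 1000)}
# Upper bounds n^(1/p) * |x|_inf from Ex 5, one for each p.
bounds = {p: len(x) ** (1.0 / p) * sup_norm for p in norms}
```

For each p the computed norm lies between `sup_norm` and the corresponding entry of `bounds`, and the values decrease toward `sup_norm` as p grows.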

Ex 6. Introduce a basis and represent the points by arrays of real numbers. Since all
norms are equivalent, it suffices to prove completeness in the $|x|_\infty$ norm.

Let $\{x_n\}$ be a convergent sequence in the $|x|_\infty$ norm. Denote the components of $x_n$
by $x_{n,j}$. It follows from $|x_n - x_m|_\infty \to 0$ that

$$|x_{n,j} - x_{m,j}| \to 0$$

for every j. Since the real numbers are complete, it follows that

$$\lim_{n \to \infty} x_{n,j} = x_j.$$

Denote by x the vector with components $x_j$; it follows that

$$\lim_{n \to \infty} |x_n - x|_\infty = 0.$$

For another proof see Theorem 3.
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CHAPTER 15 

Ex 1. According to (i), $|Tx| \leq c|x|$. Apply this to $|T(x_n - x)| \leq c|x_n - x|$.
Ex 3, We have shown above that 


(I — Rjr 1 = i - r m+1 

Multiply both sides on the left by S _1 : 

T ] = S_ 】 —S^R 朴 1 

Therefore 

ir* -s _, | < |8 _| ^ +1 | < |5 _1 |||^ +| |. 

Since |R < 1. |R J,+ I | < R " 1 lends to zero as n — oo. 

Ex 5. Decompose n modulo m:

$$n = km + r, \qquad 0 \leq r < m.$$

Then

$$R^n = (R^m)^k R^r;$$

therefore

$$|R^n| \leq |R^m|^k\, |R^r|.$$

As n tends to $\infty$, so does k; therefore if $|R^m| < 1$, $|R^n|$ tends to zero as n tends to $\infty$.
That is all that was used in the proof in Ex. 3, that $T_n$ tends in norm to $S^{-1}$.

Ex 6, The components of y — Tx are 


Since \xj\ < K 




Lvil — 


So Moo ^ maX T,j 1/rillXl^； (23) follows 
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CHAPTER 16 

Ex 1. What has to be shown is that if $Px \leq \lambda x$, then $\lambda \geq \lambda(P)$. To see this consider
$P^T$, the transpose of P; it, too, has positive entries, so by Theorem 1 it has a dominant
eigenvalue $\lambda(P^T)$ and a corresponding nonnegative eigenvector k:

$$P^T k = \lambda(P^T)\, k.$$

Take the scalar product of $Px \leq \lambda x$ with k; since k is a nonnegative vector

$$(Px, k) \leq \lambda\,(x, k).$$

The left side equals $(x, P^T k) = \lambda(P^T)(x, k)$. Since x and k are nonnegative vectors,
$(x, k)$ is positive, and we conclude that $\lambda(P^T) \leq \lambda$. But P and $P^T$ have the same
eigenvalues, and therefore the largest eigenvalue $\lambda(P)$ of P equals the largest
eigenvalue $\lambda(P^T)$ of $P^T$.

Ex 2. Denote by $\mu$ the dominant eigenvalue of $P^m$, and by k the associated
eigenvector:

$$P^m k = \mu k.$$

Let P act on this relation:

$$P^{m+1} k = P^m Pk = \mu Pk,$$

which shows that Pk, too, is an eigenvector of $P^m$ with eigenvalue $\mu$. Since the
dominant eigenvalue has multiplicity one, $Pk = ck$. Repeated application of P shows
that $P^m k = c^m k$. Therefore $c^m = \mu$. From $Pk = ck$ we deduce that c is real and
positive; therefore it is the real root $\mu^{1/m}$. Since the entries of k are positive,
$c = \mu^{1/m}$ is the dominant eigenvalue of P.
APPENDIX 1


Special Determinants


There are some classes of matrices whose determinants can be expressed by compact
algebraic formulas. We give some interesting examples.

Definition. A Vandermonde matrix is a square matrix whose columns form a
geometric progression. That is, let $a_1, \dots, a_n$ be $n$ scalars; then $V(a_1, \dots, a_n)$ is the
matrix

    $V(a_1, \dots, a_n) = \begin{pmatrix} 1 & \cdots & 1 \\ a_1 & \cdots & a_n \\ \vdots & & \vdots \\ a_1^{n-1} & \cdots & a_n^{n-1} \end{pmatrix}.$    (1)

Theorem 1.

    $\det V(a_1, \dots, a_n) = \prod_{j>i} (a_j - a_i).$    (2)

Proof. Using formula (16) of Chapter 5 for the determinant, we conclude that
$\det V$ is a polynomial in the $a_i$ of degree less than or equal to $n(n-1)/2$. Whenever
two of the scalars $a_i$ and $a_j$, $i \ne j$, are equal, $V$ has two equal columns and so its
determinant is zero; therefore, according to the factor theorem of algebra, $\det V$ is
divisible by $a_j - a_i$. It follows that $\det V$ is divisible by the product

    $\prod_{j>i} (a_j - a_i).$


Linear Algebra and Its Applications, Second Edition, by Peter D. Lax
Copyright © 2007 John Wiley & Sons, Inc.
This product has degree $n(n-1)/2$, the same as the degree of $\det V$. Therefore

    $\det V = c_n \prod_{j>i} (a_j - a_i),$    (2)'

$c_n$ a constant. We claim that $c_n = 1$; to see this we use the Laplace expansion (26)
of Chapter 5 for $\det V$ with respect to the last column, that is, $j = n$. We get in this
way an expansion of $\det V$ in powers of $a_n$; the coefficient of $a_n^{n-1}$ is
$\det V(a_1, \dots, a_{n-1})$. On the other hand, the coefficient of $a_n^{n-1}$ on the right of (2)'
is $c_n \prod_{n>j>i} (a_j - a_i)$. Using expression (2)' for $V(a_1, \dots, a_{n-1})$, we deduce that
$c_n = c_{n-1}$. An explicit calculation shows that $c_2 = 1$; hence by induction $c_n = 1$ for
all $n$, and (2) follows. □
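Formula (2) is easy to test on integers. The sketch below is only a numerical check, not part of the proof; the function names and the sample scalars are illustrative assumptions, and the determinant is computed by a naive Laplace expansion that is adequate only for small matrices.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def vandermonde(a):
    """Row i holds the i-th powers, so column j reads 1, a_j, a_j^2, ..."""
    return [[x ** i for x in a] for i in range(len(a))]


def product_formula(a):
    """Right side of (2): product of (a_j - a_i) over all pairs j > i."""
    result = 1
    for i, j in combinations(range(len(a)), 2):   # yields pairs with i < j
        result *= a[j] - a[i]
    return result


a = [2, 3, 5, 7]   # sample scalars (an assumption)
print(det(vandermonde(a)), product_formula(a))
```

Both printed numbers agree, and repeating one of the scalars makes both vanish, as the factor-theorem argument predicts.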

Definition. Let $a_1, \dots, a_n$ and $b_1, \dots, b_n$ be $2n$ scalars. The Cauchy matrix
$C(a_1, \dots, a_n; b_1, \dots, b_n)$ is the $n \times n$ matrix whose $ij$th element is $1/(a_i + b_j)$.

Theorem 2.

    $\det C(a, b) = \dfrac{\prod_{j>i} (a_j - a_i)(b_j - b_i)}{\prod_{i,j} (a_i + b_j)}.$    (3)

Proof. Using formula (16) of Chapter 5 for the determinant of $C(a, b)$, and using
the common denominator for all terms, we can write

    $\det C(a, b) = \dfrac{P(a, b)}{\prod_{i,j} (a_i + b_j)},$    (4)

where $P(a, b)$ is a polynomial whose degree is less than or equal to $n^2 - n$.
Whenever two of the scalars $a_i$ and $a_j$ are equal, the $i$th and $j$th rows of $C(a, b)$ are
equal; likewise, when $b_i = b_j$, the $i$th and $j$th columns of $C(a, b)$ are equal. In either
case, $\det C(a, b) = 0$; therefore, by the factor theorem of algebra, the polynomial
$P(a, b)$ is divisible by $(a_j - a_i)$ and by $(b_j - b_i)$, and therefore by the product

    $\prod_{j>i} (a_j - a_i)(b_j - b_i).$

The degree of this product is $n^2 - n$, the same as the degree of $P$; therefore,

    $P(a, b) = c_n \prod_{j>i} (a_j - a_i)(b_j - b_i),$    (4)'
$c_n$ a constant. We claim that $c_n = 1$; to see this we use the Laplace expansion for
$C(a, b)$ with respect to the last column, $j = n$; the term corresponding to the element
$1/(a_n + b_n)$ is

    $\det C(a_1, \dots, a_{n-1}; b_1, \dots, b_{n-1}) \, \dfrac{1}{a_n + b_n}.$

Now set $a_n = b_n = d$; we get from (4) and (4)' that

    $\det C(a_1, \dots, a_{n-1}, d; b_1, \dots, b_{n-1}, d) = \dfrac{c_n \prod_{n>j>i} (a_j - a_i)(b_j - b_i) \prod_{i<n} (d - a_i)(d - b_i)}{2d \prod_{i<n} (d + a_i)(d + b_i) \prod_{i,j<n} (a_i + b_j)}.$

From the Laplace expansion we get

    $\det C(a_1, \dots, a_{n-1}, d; b_1, \dots, b_{n-1}, d) = \dfrac{1}{2d} \det C(a_1, \dots, a_{n-1}; b_1, \dots, b_{n-1}) + \text{other terms}.$

Multiply both expressions by $2d$ and set $d = 0$; using (4)' to express
$\det C(a_1, \dots, a_{n-1}; b_1, \dots, b_{n-1})$, we deduce that $c_n = c_{n-1}$. An explicit calculation
shows that $c_1 = 1$, so we conclude by induction that $c_n = 1$ for all $n$; (3) now follows
from (4) and (4)'. □
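The Cauchy determinant formula can be confirmed with exact rational arithmetic. This is a small sketch under illustrative assumptions (the function names and sample scalars are not from the text); it uses the entries $1/(a_i + b_j)$ from the definition of the Cauchy matrix.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def cauchy(a, b):
    """Matrix whose (i, j) entry is 1 / (a_i + b_j), kept as exact fractions."""
    return [[Fraction(1, ai + bj) for bj in b] for ai in a]


def cauchy_formula(a, b):
    """Product over j > i of (a_j - a_i)(b_j - b_i), divided by the product of all (a_i + b_j)."""
    numerator = 1
    for i, j in combinations(range(len(a)), 2):   # pairs with i < j
        numerator *= (a[j] - a[i]) * (b[j] - b[i])
    denominator = 1
    for ai in a:
        for bj in b:
            denominator *= ai + bj
    return Fraction(numerator, denominator)


a = [1, 2, 4]   # sample integer scalars (an assumption)
b = [1, 3, 5]
print(det(cauchy(a, b)), cauchy_formula(a, b))
```

For $n = 2$ with $a = b = (1, 2)$ the determinant is $1/8 - 1/9 = 1/72$, which matches $1 \cdot 1 / (2 \cdot 3 \cdot 3 \cdot 4)$.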

Note. Every minor of a Cauchy matrix is a Cauchy matrix.

Exercise 1. Let

    $p(s) = x_1 + x_2 s + \dots + x_n s^{n-1}$

be a polynomial of degree less than $n$. Let $a_1, \dots, a_n$ be $n$ distinct numbers and let
$p_1, \dots, p_n$ be $n$ arbitrary complex numbers; we wish to choose the coefficients
$x_1, \dots, x_n$ so that

    $p(a_i) = p_i, \qquad i = 1, \dots, n.$

This is a system of $n$ linear equations for the $n$ coefficients $x_i$. Find the matrix of this
system of equations and show that its determinant is $\ne 0$.

Exercise 2. Find an algebraic formula for the determinant of the matrix whose
$ij$th element is

    $\dfrac{1}{1 + a_i a_j};$

here $a_1, \dots, a_n$ are arbitrary scalars.
APPENDIX 2


The Pfaffian


Let $A$ be an $n \times n$ antisymmetric matrix:

    $A^T = -A.$

We have seen in Chapter 5 that a matrix and its transpose have the same determinant.
We have also seen that the determinant of $-A$ is $(-1)^n \det A$, so

    $\det A = \det A^T = \det(-A) = (-1)^n \det A.$

When $n$ is odd, it follows that $\det A = 0$; what can we say about the even case?

Suppose the entries of $A$ are real; then the eigenvalues come in complex
conjugate pairs. On the other hand, according to the spectral theory of anti-self-adjoint
matrices, the eigenvalues of $A$ are purely imaginary. It follows that the
eigenvalues of $A$ are $(-i\lambda_1, \dots, -i\lambda_{n/2}, i\lambda_1, \dots, i\lambda_{n/2})$. Their product is $\left(\prod \lambda_j\right)^2$, a
nonnegative number; since the determinant of a matrix is the product of its eigenvalues,
we conclude that the determinant of an antisymmetric matrix of even order with real
entries is nonnegative.

Far more is true:

Theorem of Cayley. The determinant of an antisymmetric matrix $A$ of
even order is the square of a homogeneous polynomial of degree $n/2$ in the
entries of $A$:

    $\det A = P^2.$


P is called the Pfaffian. 
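For $n = 4$ the identity $\det A = P^2$ can be checked directly. The sketch below assumes the standard closed form of the 4 x 4 Pfaffian, $a_{12}a_{34} - a_{13}a_{24} + a_{14}a_{23}$, which is not stated in the text; the function names and sample entries are likewise illustrative.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def antisymmetric4(a12, a13, a14, a23, a24, a34):
    """Build the 4 x 4 antisymmetric matrix with the given entries above the diagonal."""
    return [[0, a12, a13, a14],
            [-a12, 0, a23, a24],
            [-a13, -a23, 0, a34],
            [-a14, -a24, -a34, 0]]


def pfaffian4(A):
    """Standard 4 x 4 Pfaffian (assumed formula): a12*a34 - a13*a24 + a14*a23."""
    return A[0][1] * A[2][3] - A[0][2] * A[1][3] + A[0][3] * A[1][2]


A = antisymmetric4(1, 2, 3, 4, 5, 6)       # generic sample entries
B = antisymmetric4(2, 0, 0, 3, 0, 5)       # tridiagonal sample: a = 2, b = 3, c = 5

print(det(A), pfaffian4(A) ** 2)
print(det(B), pfaffian4(B) ** 2)
```

In the tridiagonal sample the middle entry $b$ drops out of the Pfaffian, so the determinant depends only on the product of the two outer entries.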




Exercise 1. Verify by a calculation Cayley's theorem for $n = 4$.

Proof. The proof is based on the following lemma.

Lemma 1. There is a matrix $C$ whose entries are polynomials in the entries of
the antisymmetric matrix $A$ such that

    $B = CAC^T$    (1)

is antisymmetric and tridiagonal, that is, $b_{ij} = 0$ for $|i - j| > 1$. Furthermore,
$\det C \not\equiv 0$.

Proof. We construct $C$ as a product

    $C = C_{n-2} \cdots C_2 C_1.$

$C_1$ is required to have the following properties:

(i) $B_1 = C_1 A C_1^T$ has zeros for the last $(n - 2)$ entries in its first column.

(ii) The first row of $C_1$ is $e_1 = (1, 0, \dots, 0)$, its first column is $e_1^T$.

It follows from (ii) that $C_1^T$ maps $e_1^T$ into $e_1^T$; therefore the first column of $B_1$, $B_1 e_1^T$,
is $C_1 A C_1^T e_1^T = C_1 A e_1^T = C_1 a$, where $a$ denotes the first column of $A$. To satisfy (i)
we have to choose the rest of $C_1$ so that the last $(n - 2)$ entries of $C_1 a$ are zero. This
requires the last $n - 2$ rows of $C_1$ to be orthogonal to $a$. This is easily accomplished:
set the second row of $C_1$ equal to $e_2 = (0, 1, 0, \dots, 0)$, the third row $(0, a_3, -a_2, 0, \dots, 0)$,
the fourth row $(0, 0, a_4, -a_3, 0, \dots, 0)$, and so on, where $a_1, \dots, a_n$ are the
entries of the vector $a$. Clearly

    $\det C_1 = a_2 a_3 \cdots a_{n-1}$

is a nonzero polynomial.

We proceed recursively; we construct $C_2$ so its first two rows are $e_1$ and $e_2$, its first
two columns $e_1^T$ and $e_2^T$. Then the first column of $B_2 = C_2 B_1 C_2^T$ has zero for its last
$n - 2$ entries. As before, we fill in the rest of $C_2$ so that the second column of $B_2$ has
zeros for its last $n - 3$ entries. Clearly, $\det C_2$ is a nonzero polynomial.

After $(n - 2)$ steps we end with $C = C_{n-2} \cdots C_1$, having the property that
$B = CAC^T$ has zero entries below the first subdiagonal, that is, $b_{ij} = 0$ for $i > j + 1$.
$B^T = CA^TC^T = -B$, that is, $B$ is antisymmetric. It follows that its only nonzero
entries lie on the sub- and superdiagonals $j = i \pm 1$. Since $B^T = -B$, $b_{i,i-1} = -b_{i-1,i}$.
Furthermore, by construction,

    $\det C = \prod \det C_i \not\equiv 0.$    (2)

□
What is the determinant of an antisymmetric, tridiagonal matrix of even order?
Consider the $4 \times 4$ case

    $B = \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & b & 0 \\ 0 & -b & 0 & c \\ 0 & 0 & -c & 0 \end{pmatrix}.$

Its determinant $\det B = a^2 c^2$ is the square of a single product. The same is true in
general: the determinant of a tridiagonal antisymmetric matrix $B$ of even order is the
square of a single product,

    $\det B = \left( \prod b_{2k,2k-1} \right)^2.$    (3)

Using the multiplicative property of determinants, and that $\det C^T = \det C$, we
deduce from (1) that

    $\det B = (\det C)^2 \det A;$

combining this with (2) and (3) we deduce that $\det A$ is the square of a rational
function in the entries of $A$. To conclude we need therefore Lemma 2.

Lemma 2. If a polynomial $P$ in $n$ variables is the square of a rational function
$R$, $R$ is a polynomial.

Proof. For functions of one variable this follows by elementary algebra; so we
can conclude that for each fixed variable $x$, $R$ is a polynomial in $x$, with coefficients
from the field of rational functions in the remaining variables. It follows that there
exists a $k$ such that the $k$th partial derivative of $R$ with respect to any of the variables
is zero. From this it is easy to deduce, by induction on the number of variables, that $R$
is a polynomial in all variables. □



Apptying 6J. wc finally ronrliicir llml 


f IA4 n (A u + 5 f)| dy - vola-i AJ* 

Jn 

Head bp! 僻 (ti-2-3) follows. Now wp olm，rw H;tM 

“g ™ ■“吣 < i 

by (6.2.2). Therefore, tke average valiio of tlie miinlnT d iwtiitH in {3/\it/,j)n Aj- b 
stTktly HinalliT Itian 1. Thwefore. tlww stmsl hv tyi j e 11 Hiu h that thi- intersection 
(A/ \ jV/o) fi A x is empty. Bv (6 r 2.I) wf t!iat H A, ronnists t?f at most 

ibn mt<3 v*ct*r, which wmp3ct«t ilie proof. □ 

PROBLEMS, 

1. Let 诊 lx? a LebesgLK! hikgmhlt- fjiu L iiDn on IR^ aii«l k p t A c be a latlim 
Prow that iherp exitits a : £ R d Hllc}i th»t 

u^A 

2 - Let ^bea boundwl R-icmaim iTitfKrabk 3 r»nr( ion vaiiiKliiiiE uut^Kl^ A iM'wi^krd 

r€j;ian in St d „ > 1, and let f In* ft positive? number. that thm! fxiistii h mii- 

imudiuLar lattice A C sturh th^t 

冗 fli ㈤ < e -h / dx 

«€A \ 叫 ^ 

3 - L(fl M C R rf , (i > 1, be a bounded c^ntridly symmtnrk ,Jonl 啡 int!nsnriil>k 

set and let & > (l/2)(vol ,W). Provo that there cjrisits 4 lallicx. 1 A C !R rf swdl iJml 
det A = ^ and M does rtut any lattice point, estci-pt powiblv 0, 

Hint: Either Af docs not contnin any iw«r^oro lattice [xihit il ««iliuisA al 

k 祕 I ： it 鄉 . 

(6.3) Corollary, F»r anu ft < 2~ 4 therv exists a d-dimenmonal iaitin A C 
wfwsf. packing density in at itarf a. Similarly, /o^ nTijf 

^< T -^^ f ( 1+0 & 

there m.tts a dimensional taUifr. A C R 41 wifJf radim nl tnuit p dfirf 

l. 



APPENDIX 3


Symplectic Matrices


for all $x$ and $y$, satisfies

    $S^T J S = J$    (4)

and conversely.

Proof. $(Sx, JSy) = (x, S^T J S y)$. If this is equal to $(x, Jy)$ for all $x$, $y$, $S^T J S y = Jy$
for all $y$. □

A real matrix $S$ that satisfies (4) is called a symplectic matrix. The set of all
symplectic matrices is denoted as $Sp(n)$.

Theorem 2. (i) Symplectic matrices form a group under matrix multiplication.

(ii) If $S$ is symplectic, so is its transpose $S^T$.

(iii) A symplectic matrix $S$ is similar to its inverse $S^{-1}$.

Proof. (i) It follows from (4) that every symplectic matrix is invertible. That
they form a group follows from (3). To verify (ii), take the inverse of (4); using (2)
we get

    $S^{-1} J (S^T)^{-1} = J.$

Multiplying by $S$ on the left, $S^T$ on the right shows that $S^T$ satisfies (4).

To deduce (iii) multiply (4) by $S^{-1}$ on the right and $J^{-1}$ on the left. We get that
$J^{-1} S^T J = S^{-1}$, that is, that $S^{-1}$ is similar to $S^T$. Since $S^T$ is similar to $S$, (iii)
follows. □

Theorem 3. Let $S(t)$ be a differentiable function of the real variable $t$, whose
values are symplectic matrices. Define $G(t)$ by

    $\dfrac{d}{dt} S = GS.$    (5)

Then $G$ is of the form

    $G = JL(t), \qquad L \text{ self-adjoint}.$    (6)

Conversely, if $S(t)$ satisfies (5) and (6) and $S(0)$ is symplectic, then $S(t)$ is a family of
symplectic matrices.

Proof. For each $t$ (4) is satisfied; differentiate it with respect to $t$:
    $\left( \dfrac{d}{dt} S^T \right) J S + S^T J \dfrac{d}{dt} S = 0.$

Multiply by $S^{-1}$ on the right, $(S^T)^{-1}$ on the left:

    $(S^T)^{-1} \left( \dfrac{d}{dt} S^T \right) J + J \left( \dfrac{d}{dt} S \right) S^{-1} = 0.$    (7)

We use (5) to define $G$:

    $G = \left( \dfrac{d}{dt} S \right) S^{-1}.$

Taking the transpose we get

    $G^T = (S^T)^{-1} \dfrac{d}{dt} S^T.$

Setting these into (7) gives

    $G^T J + JG = 0,$

from which (6) follows. □

Exercise 2. Prove the converse.

We turn now to the spectrum of a symplectic matrix $S$. Since $S$ is real, its complex
eigenvalues come in conjugate pairs, that is, if $\lambda$ is an eigenvalue, so is $\bar\lambda$. According
to part (iii) of Theorem 2, $S$ and $S^{-1}$ are similar; since similar matrices have the same
spectrum, it follows that if $\lambda$ is an eigenvalue of $S$, so is $\lambda^{-1}$, and it has the same
multiplicity. Thus the eigenvalues of a symplectic matrix $S$ come in groups of four:
$\lambda$, $\bar\lambda$, $\lambda^{-1}$, $\bar\lambda^{-1}$, with three exceptions:

(a) When $\lambda$ lies on the unit circle, that is, $|\lambda| = 1$, then $\lambda^{-1} = \bar\lambda$, so we only
have a group of two.

(b) When $\lambda$ is real, $\bar\lambda = \lambda$, so we only have a group of two.

(c) $\lambda = 1$ or $-1$.

The possibility is still open that $\lambda = \pm 1$ are simple eigenvalues of $S$; but this
cannot occur according to

Theorem 4. For a symplectic matrix $S$, $\lambda = 1$ or $-1$ cannot be a simple
eigenvalue.

Proof. We argue indirectly: suppose, say, that $\lambda = -1$ is a simple eigenvalue,
with eigenvector $h$:

    $Sh = -h.$    (8)
Multiplying both sides by $S^T J$ and using (4) we get

    $Jh = -S^T J h,$    (8)'

which shows that $Jh$ is eigenvector of $S^T$ with eigenvalue $-1$.

We choose any self-adjoint, positive matrix $L$, and set $G = JL$. We define the
one-parameter family of matrices $S(t)$ as $e^{tG} S$; it satisfies

    $\dfrac{d}{dt} S(t) = G S(t), \qquad S(0) = S.$    (9)

According to Theorem 3, $S(t)$ is symplectic for all $t$.

If $S(0)$ has $-1$ as eigenvalue of multiplicity one, then for $t$ small, $S(t)$ has a single
eigenvalue near $-1$. This eigenvalue $\lambda$ equals $-1$, for otherwise $\lambda^{-1}$ would be
another eigenvalue near $-1$. According to Theorem 8 of Chapter 9, the eigenvector
$h(t)$ is a differentiable function of $t$. Differentiating $Sh = -h$ yields

    $\left( \dfrac{d}{dt} S \right) h + S h_t = -h_t, \qquad h_t = \dfrac{d}{dt} h.$

Using (9) and (8) we get

    $Gh = h_t + S h_t.$

Form the scalar product with $Jh$; using (8)' we get

    $(Gh, Jh) = (h_t, Jh) + (S h_t, Jh) = (h_t, Jh) + (h_t, S^T J h) = (h_t, Jh) - (h_t, Jh) = 0.$    (10)

According to (6), $G = JL$; set this into (10); since by (2), $J^T J = I$, we have

    $(JLh, Jh) = (Lh, J^T J h) = (Lh, h) = 0.$    (10)'

Since $L$ was chosen to be self-adjoint and positive, $h = 0$, a contradiction. □

Exercise 3. Prove that plus or minus 1 cannot be an eigenvalue of odd
multiplicity of a symplectic matrix.

Taking the determinant of (4), using the multiplicative property, and that
$\det S^T = \det S$, we deduce that $(\det S)^2 = 1$, so that $\det S = 1$ or $-1$. More is
true.
Theorem 5. The determinant of a symplectic matrix $S$ is 1.

Proof. Since we already know that $(\det S)^2 = 1$, we only have to exclude the
possibility that $\det S$ is negative. The determinant of a matrix is the product of its
eigenvalues. The complex eigenvalues come in conjugate pairs; their product is
positive. The real eigenvalues $\ne 1, -1$ come in pairs $\lambda$, $\lambda^{-1}$, and their product is
positive. According to Exercise 3, $-1$ is an eigenvalue of even multiplicity; so the
product of the eigenvalues is positive. □

We remark that it can be shown that the space $Sp(n)$ of symplectic matrices is
connected. Since $(\det S)^2 = 1$ and since $S = I$ has determinant 1, it follows that
$\det S = 1$ for all $S$ in $Sp(n)$.
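The defining relation (4), the group property, and Theorem 5 can be spot-checked on a 4 x 4 example. The sketch below assumes the usual block form $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$, since the definition of $J$ is not part of the surviving text; the sample matrices `S1`, `S2` and the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


# Assumed standard form of J: blocks [[0, I], [-I, 0]] with 2 x 2 blocks.
J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]


def is_symplectic(S):
    """Check the relation S^T J S = J, equation (4)."""
    return matmul(matmul(transpose(S), J), S) == J


# Block diagonal sample diag(A, (A^T)^{-1}) with A = [[2, 1], [1, 1]].
S1 = [[2, 1, 0, 0],
      [1, 1, 0, 0],
      [0, 0, 1, -1],
      [0, 0, -1, 2]]

# Shear sample [[I, B], [0, I]] with symmetric B = [[1, 2], [2, 3]].
S2 = [[1, 0, 1, 2],
      [0, 1, 2, 3],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]

S = matmul(S1, S2)   # a product of symplectic matrices
print(is_symplectic(S), is_symplectic(transpose(S)), det(S))
```

The product and its transpose both satisfy (4), and the determinant comes out as 1 rather than $-1$, consistent with Theorem 2 and Theorem 5.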

Symplectic matrices first appeared in Hamiltonian mechanics, governed by
equations of the form

    $\dfrac{d}{dt} u = J H_u,$    (11)

where $u(t)$ lies in $\mathbb{R}^{2n}$, $H$ is some smooth function in $\mathbb{R}^{2n}$, and $H_u$ is its gradient.

Definition. A nonlinear mapping $u \to v$ is called a canonical transformation if
its Jacobian matrix $\partial v / \partial u$ is symplectic.

Theorem 6. A canonical transformation changes every Hamiltonian equation
(11) into another equation of Hamiltonian form:

    $\dfrac{d}{dt} v = J K_v,$

where $K(v(u)) = H(u)$.

Exercise 4. Verify Theorem 6.



APPENDIX 4 


Tensor Product 


For an analyst, a good way to think of the tensor product of two linear spaces is to
take one space as the space of polynomials in $x$ of degree less than $n$, the other as
the polynomials in $y$ of degree less than $m$. Their tensor product is the space of
polynomials in $x$ and $y$, of degree less than $n$ in $x$, less than $m$ in $y$. A natural basis for
polynomials are the powers $1, x, \dots, x^{n-1}$ and $1, y, \dots, y^{m-1}$, respectively; a natural
basis for polynomials in $x$ and $y$ is $x^i y^j$, $i < n$, $j < m$.

This sets the stage for defining the tensor product of two linear spaces $U$ and $V$ as
follows: Let $\{e_i\}$ be a basis of the linear space $U$, $\{f_j\}$ a basis for the linear space $V$.
Then $\{e_i \otimes f_j\}$ is a basis for their tensor product $U \otimes V$.
It follows from this definition that

    $\dim U \otimes V = (\dim U)(\dim V).$    (1)

The definition, however, is ugly, since it uses basis vectors.

Exercise 1. Establish a natural isomorphism between tensor products defined
with respect to two pairs of distinct bases.

Happily, we can define $U \otimes V$ in an invariant manner.
Take the collection of all formal sums

    $\sum u_i \otimes v_i,$    (2)

where $u_i$ and $v_i$ are arbitrary vectors in $U$ and $V$, respectively. Clearly, these sums
form a linear space.

Sums of the form

    $(u_1 + u_2) \otimes v - u_1 \otimes v - u_2 \otimes v$    (3)
and

    $u \otimes (v_1 + v_2) - u \otimes v_1 - u \otimes v_2$    (3)'

are special cases of such sums. These, and all their linear combinations, are called null sums.
Using these concepts we can give a basis-free definition of tensor product.

Definition. The tensor product $U \otimes V$ of two finite-dimensional linear spaces $U$
and $V$ is the quotient space of the space of all formal sums (2) modulo all null sums
(3), (3)'.

This definition is basis-free, but a little awkward. Happily, there is an elegant way
of presenting it.

Theorem 1. There is a natural isomorphism between $U \otimes V$ as defined above
and $\mathcal{L}(U', V)$, the space of all linear mappings of $U'$ into $V$, where $U'$ is the dual of $U$.

Proof. Let $\sum u_i \otimes v_i$ be a representative of an equivalence class in the quotient
space. For any $l$ in $U'$, assign to $l$ the image $\sum l(u_i) v_i$ in $V$. Since every null sum is
mapped into zero, this mapping depends only on the equivalence class.

The mapping $L$,

    $l \to \sum l(u_i) v_i,$

is clearly linear and the assignment

    $\left\{ \sum u_i \otimes v_i \right\} \to L$    (4)

also is linear. It is not hard to show that every $L$ in $\mathcal{L}(U', V)$ is the image of some
vector in $U \otimes V$. □

Exercise 2. Verify that (4) maps $U \otimes V$ onto $\mathcal{L}(U', V)$.

Theorem 1 treats the spaces $U$ and $V$ asymmetrically. The roles of $U$ and $V$ can be
interchanged, leading to an isomorphism of $U \otimes V$ and $\mathcal{L}(V', U)$. The dual of a map
$L: U' \to V$ is of course a map $L': V' \to U$.

When $U$ and $V$ are equipped with real Euclidean structure, there is a natural way
to equip $U \otimes V$ with Euclidean structure. As before, there are two ways of going
about it. One is to choose orthonormal bases $\{e_i\}$, $\{f_j\}$ in $U$ and $V$ respectively, and
declare $\{e_i \otimes f_j\}$ to be an orthonormal basis for $U \otimes V$. It remains to be shown that
this Euclidean structure is independent of the choice of the orthonormal bases; this is
easily done, based on the following lemma.



Lemma. Let $u$, $z$ be a pair of vectors in $U$, and $v$, $w$ a pair of vectors in $V$. Then

    $(u \otimes v, z \otimes w) = (u, z)(v, w).$    (5)

Proof. Expand $u$ and $z$ in terms of the $e_i$, and $v$ and $w$ in terms of the $f_j$:

    $u = \sum a_i e_i, \qquad z = \sum b_k e_k,$

    $v = \sum c_j f_j, \qquad w = \sum d_l f_l.$

Then

    $u \otimes v = \sum a_i c_j \, e_i \otimes f_j, \qquad z \otimes w = \sum b_k d_l \, e_k \otimes f_l;$

    $(u \otimes v, z \otimes w) = \sum a_i c_j b_i d_j = \left( \sum a_i b_i \right) \left( \sum c_j d_j \right) = (u, z)(v, w).$ □

Take the example presented at the beginning, where $U$ is the space of polynomials
in $x$ of degree $< n$, and $V$ is the space of polynomials in $y$ of degree $< m$. Define in $U$
the square integral over an $x$-interval $A$ as the Euclidean norm, and in $V$ the square
integral over a $y$-interval $B$. Then the Euclidean structure in $U \otimes V$ defined by (5) is
the square integral over the rectangle $A \times B$.

We show now how to use the representation of $U \otimes V$ as $\mathcal{L}(U', V)$ to derive the
Euclidean structure in $U \otimes V$ from the Euclidean structure of $U$ and $V$. Here
$U' = U$, so $U \otimes V$ is $\mathcal{L}(U, V)$.

Let $M$ and $L$ belong to $\mathcal{L}(U, V)$, and let $L^*$ be the adjoint of $L$. We define

    $(M, L) = \operatorname{tr} L^* M.$    (6)

Clearly this depends bilinearly on $M$ and $L$. In terms of orthonormal bases, $M$ and $L$
can be expressed as matrices $(m_{ij})$ and $(l_{ij})$, and $L^*$ as the transpose $(l_{ji})$. Then

    $\operatorname{tr} L^* M = \sum l_{ji} m_{ji}.$

Setting $L = M$, we get

    $\|M\|^2 = (M, M) = \sum m_{ji}^2,$

consistent with our previous definition.

Complex Euclidean structures can be handled the same way.






Schur's peculiar theorem from Chapter 10, Theorem 7: if $A = (A_{ij})$ and $B = (B_{ij})$
are positive symmetric $n \times n$ matrices, then so is their entry by entry product
$M = (A_{ij} B_{ij})$.

Proof. It was observed in Theorem 6 of Chapter 10 that every positive symmetric
matrix can be written as a Gram matrix:

    $A_{ij} = (u_i, u_j), \qquad u_i \subset U, \text{ linearly independent}.$

    $B_{ij} = (v_i, v_j), \qquad v_i \subset V, \text{ linearly independent}.$

Now define $g_i$ in $U \otimes V$ to be $u_i \otimes v_i$; by (5), $(g_i, g_j) = A_{ij} B_{ij}$.

This shows that $M$ is a Gram matrix, therefore nonnegative. □
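The Gram matrix argument can be made concrete in coordinates. The sketch below represents $u \otimes v$ by the Kronecker product of coordinate vectors with respect to orthonormal bases, which is one realization of the tensor product and an assumption of this example; the sample vectors and function names are illustrative.

```python
def dot(x, y):
    """Euclidean scalar product of two coordinate vectors."""
    return sum(a * b for a, b in zip(x, y))


def kron(u, v):
    """Coordinates of u (x) v in the basis e_i (x) f_j: all products u_i * v_j."""
    return [a * b for a in u for b in v]


def gram(vectors):
    """Gram matrix whose (i, j) entry is the scalar product of vectors i and j."""
    return [[dot(x, y) for y in vectors] for x in vectors]


us = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]   # sample linearly independent vectors in U
vs = [[1, 1, 0], [0, 1, 1], [1, 0, 2]]   # sample linearly independent vectors in V

A = gram(us)
B = gram(vs)
gs = [kron(u, v) for u, v in zip(us, vs)]   # g_i = u_i (x) v_i
M = gram(gs)

# Entry-by-entry product of A and B, to compare with the Gram matrix M.
schur_product = [[A[i][j] * B[i][j] for j in range(3)] for i in range(3)]
print(M == schur_product)
```

The comparison confirms, for this sample, that the entry-by-entry product of two Gram matrices is itself the Gram matrix of the vectors $u_i \otimes v_i$.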

Exercise 3. Show that if $\{u_i\}$ and $\{v_i\}$ are linearly independent, so are $u_i \otimes v_i$.
Show that $M_{ij}$ is positive.

Exercise 4. Let $u$ be a twice differentiable function of $x_1, \dots, x_n$ defined in a
neighborhood of a point $p$, where $u$ has a local minimum. Let $(A_{ij})$ be a symmetric,
nonnegative matrix. Show that

    $\sum A_{ij} \dfrac{\partial^2 u}{\partial x_i \partial x_j}(p) \ge 0.$
APPENDIX 5


Lattices


Definition. A lattice is a subset $L$ of a linear space $X$ over the reals with the
following properties:

(i) $L$ is closed under addition and subtraction; that is, if $x$ and $y$ belong to $L$, so
do $x + y$ and $x - y$.

(ii) $L$ is discrete, in the sense that any bounded (as measured in any norm) set of
$X$ contains only a finite number of points of $L$.

An example of a lattice in $\mathbb{R}^n$ is the collection of points $x = (x_1, \dots, x_n)$ with
integer components. The basic theorem of the subject says that this example is
typical.


Theorem 1. Every lattice has an integer basis, that is, a collection of vectors in
$L$ such that every vector in the lattice can be expressed uniquely as a linear
combination of basis vectors with integer coefficients.

Proof. The dimension of a lattice $L$ is the dimension of the linear space it spans.
Let $L$ be $k$-dimensional, and let $p_1, \dots, p_k$ be a basis in $L$ for the span of $L$; that is,
every vector $t$ in $L$ can be expressed uniquely as

    $t = \sum a_j p_j, \qquad a_j \text{ real}.$    (1)

Consider now the subset of those vectors $t$ in $L$ which are of form (1) with $a_j$ between
0 and 1:

    $0 \le a_j \le 1, \qquad j = 1, \dots, k.$    (2)
This set is not empty, for it contains all vectors with $a_j = 0$ or 1. Since $L$ is discrete,
there are only a finite number of vectors $t$ in $L$ of this form; denote by $q_1$ that vector $t$
of form (1), (2) for which $a_1$ is positive and as small as possible.

Exercise 1. Show that $a_1$ is a rational number.

Now replace $p_1$ by $q_1$ in the basis; every vector $t$ in $L$ can be expressed uniquely as

    $t = b_1 q_1 + \sum_{j=2}^{k} b_j p_j, \qquad b_j \text{ real}.$    (3)

We claim that $b_1$ occurring in (3) is an integer; for if not, we can subtract a suitable
integer multiple of $q_1$ from $t$ so that the coefficient $b_1$ of $q_1$ lies strictly between
0 and 1:

    $0 < b_1 < 1.$

If then we substitute into (3) the representation (1) of $q_1$ in terms of $p_1, \dots, p_k$ and
add or subtract suitable integer multiples of $p_2, \dots, p_k$, we find that the $p_1$
coefficient of $t$ is positive and less than the $p_1$ coefficient of $q_1$. This contradicts our
choice of $q_1$.

We complete our proof by an induction on $k$, the dimension of the lattice.
Denote by $L_0$ the subset of $L$ consisting of those vectors $t$ in $L$ in whose representation
of form (3) $b_1$ is zero. Clearly $L_0$ is a sublattice of $L$ of dimension $k - 1$; by
induction hypothesis $L_0$ has an integer basis $q_2, \dots, q_k$. By (3), $q_1, \dots, q_k$ is an
integer basis of $L$. □

An integer basis is far from unique as is shown in the following theorem.

Theorem 2. Let $L$ be an $n$-dimensional lattice in $\mathbb{R}^n$. Let $q_1, \dots, q_n$ and
$r_1, \dots, r_n$ be two integer bases of $L$; denote by $Q$ and $R$ the matrices whose columns
are $q_i$ and $r_i$, respectively. Then

    $Q = MR,$

where $M$ is a unimodular matrix, that is, a matrix with integer entries whose
determinant is plus or minus 1.

Exercise 2. (i) Prove Theorem 2.

(ii) Show that unimodular matrices form a group under multiplication.

Definition. Let $L$ be a lattice in a linear space $X$. The dual of $L$, denoted as $L'$, is
the subset of the dual $X'$ of $X$ consisting of those vectors $\xi$ for which $(t, \xi)$ is an
integer for all $t$ in $L$.
Theorem 3. (i) The dual of an $n$-dimensional lattice in an $n$-dimensional linear
space is an $n$-dimensional lattice.

(ii) $L'' = L$.

Exercise 3. Prove Theorem 3.

Exercise 4. Show that $L$ is discrete if and only if there is a positive number $d$
such that the ball of radius $d$ centered at the origin contains no other point of $L$.
APPENDIX 6


Fast Matrix Multiplication


How many scalar multiplications are needed to form the product $C$ of two $n \times n$
matrices $A$ and $B$? Since each entry of $C$ is the product of a row of $A$ with a column
of $B$, and since $C$ has $n^2$ entries, we need $n^3$ scalar multiplications, as well as $n^3 - n^2$
additions. It was a great discovery of Volker Strassen that there is a way of
multiplying matrices that uses many fewer scalar multiplications and additions. The
crux of the idea lies in a clever way of multiplying $2 \times 2$ matrices:

    $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix},$

    $AB = C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix},$

$c_{11} = a_{11} b_{11} + a_{12} b_{21}$, $c_{12} = a_{11} b_{12} + a_{12} b_{22}$, and so on. Define

    I   = $(a_{11} + a_{22})(b_{11} + b_{22})$,
    II  = $(a_{21} + a_{22})\, b_{11}$,
    III = $a_{11} (b_{12} - b_{22})$,
    IV  = $a_{22} (b_{21} - b_{11})$,                (1)
    V   = $(a_{11} + a_{12})\, b_{22}$,
    VI  = $(a_{21} - a_{11})(b_{11} + b_{12})$,
    VII = $(a_{12} - a_{22})(b_{21} + b_{22})$.
A straightforward but tedious calculation shows that the entries of the product matrix
$C$ can be expressed as follows:

    $c_{11}$ = I + IV - V + VII,    $c_{12}$ = III + V,
    $c_{21}$ = II + IV,             $c_{22}$ = I + III - II + VI.        (2)

The point is that whereas the standard evaluation of the entries in the product matrix
uses two multiplications per entry, therefore a total of eight, the seven quantities in
(1) need only seven multiplications. The total number of additions and subtractions
needed in (1) and (2) is 18.

The formulas (1) and (2) in no way use the commutativity of the quantities $a$ and
$b$. Therefore, (1) and (2) can be used to multiply $4 \times 4$ matrices by interpreting the
entries $a_{ij}$ and $b_{ij}$ as $2 \times 2$ block entries. Proceeding recursively in this fashion, we
can use (1) and (2) to multiply any two matrices $A$ and $B$ of order $2^k$.
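The recursive scheme built from (1) and (2) can be sketched in a few lines. This is a minimal illustration for matrices of order $2^k$ stored as lists of rows, with a counter for scalar multiplications; the function names and sample matrices are assumptions of this example, and no attempt is made at efficiency.

```python
def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def split(X):
    """Split a matrix of even order into its four blocks X11, X12, X21, X22."""
    h = len(X) // 2
    return ([r[:h] for r in X[:h]], [r[h:] for r in X[:h]],
            [r[:h] for r in X[h:]], [r[h:] for r in X[h:]])


def strassen(A, B, counter):
    """Multiply A and B of order 2^k via (1) and (2); counter[0] counts scalar products."""
    if len(A) == 1:
        counter[0] += 1
        return [[A[0][0] * B[0][0]]]
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # The seven block products of (1).
    m1 = strassen(add(a11, a22), add(b11, b22), counter)
    m2 = strassen(add(a21, a22), b11, counter)
    m3 = strassen(a11, sub(b12, b22), counter)
    m4 = strassen(a22, sub(b21, b11), counter)
    m5 = strassen(add(a11, a12), b22, counter)
    m6 = strassen(sub(a21, a11), add(b11, b12), counter)
    m7 = strassen(sub(a12, a22), add(b21, b22), counter)
    # The four blocks of the product, as in (2).
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    return [r + s for r, s in zip(c11, c12)] + [r + s for r, s in zip(c21, c22)]


def naive_product(A, B):
    """Standard row-by-column product, used only as a reference."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 1, 0, 1], [1, 1, 1, 0]]
counter = [0]
C = strassen(A, B, counter)
print(C == naive_product(A, B), counter[0])
```

For these $4 \times 4$ samples the recursion performs 49 scalar multiplications, against 64 for the standard method. Because the block products are never reordered, the sketch relies only on the noncommutative form of (1) and (2).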

How many scalar multiplications $M(k)$ have to be carried out in this scheme? In
multiplying two square matrices of order $2^k$ we have to perform seven
multiplications of blocks of size $2^{k-1} \times 2^{k-1}$. This takes $7M(k - 1)$ scalar
multiplications. So

    $M(k) = 7M(k - 1).$

Since $M(0) = 1$, we deduce that

    $M(k) = 7^k = 2^{k \log_2 7} = n^{\log_2 7},$    (3)

where $n = 2^k$ is the order of the matrices to be multiplied.

Denote by $A(k)$ the number of scalar additions and subtractions needed to multiply
two matrices of order $2^k$ using Strassen's algorithm. We have to perform 18 additions
and 7 multiplications of blocks of size $2^{k-1} \times 2^{k-1}$; the latter takes $7A(k - 1)$
additions, the former $18 (2^{k-1})^2 = 9 \cdot 2^{2k-1}$. So altogether

    $A(k) = 9 \cdot 2^{2k-1} + 7A(k - 1).$

Introduce $B(k) = 7^{-k} A(k)$; then the above recursion can be rewritten as

    $B(k) = \dfrac{9}{2} \left( \dfrac{4}{7} \right)^k + B(k - 1).$

Summing with respect to $k$ we get, since $B(0) = 0$,

    $B(k) = \dfrac{9}{2} \sum_{j=1}^{k} \left( \dfrac{4}{7} \right)^j < \dfrac{9}{2} \cdot \dfrac{4}{3} = 6;$
therefore

    $A(k) < 6 \times 7^k = 6 \times 2^{k \log_2 7} = 6 n^{\log_2 7}.$    (4)

Since $\log_2 7 = 2.807\ldots$ is less than 3, the number of scalar multiplications required
in Strassen's algorithm is, for $n$ large, very much less than $n^3$, the number of scalar
multiplications required in the standard way of multiplying matrices.
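The two recursions for the operation counts can be iterated directly and compared with the bounds above. In this sketch the function names are illustrative, and the closed form $6(7^k - 4^k)$ used in the comparison is obtained here by summing the geometric series in the recursion; it is not stated in the text.

```python
def multiplications(k):
    """M(k) = 7 M(k - 1), M(0) = 1."""
    return 1 if k == 0 else 7 * multiplications(k - 1)


def additions(k):
    """A(k) = 9 * 2^(2k - 1) + 7 A(k - 1), A(0) = 0."""
    return 0 if k == 0 else 9 * 2 ** (2 * k - 1) + 7 * additions(k - 1)


# Tabulate the counts against the standard n^3 and the bound 6 * 7^k.
for k in range(1, 11):
    n = 2 ** k
    print(k, n, multiplications(k), n ** 3, additions(k), 6 * 7 ** k)
```

The table shows $M(k)$ falling further behind $n^3$ as $k$ grows, while $A(k)$ stays below $6 \cdot 7^k$.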

Matrices whose order is not a power of 2 can be turned into one by adjoining a
suitable number of 1's on the diagonal.

Refinements of Strassen's idea have led to further reduction of the number of
scalar multiplications needed to multiply two matrices. It has been conjectured that
for any positive $\varepsilon$ there is an algorithm that computes the product of two $n \times n$
matrices using $c\,n^{2+\varepsilon}$ scalar multiplications, where the constant $c$ depends on $\varepsilon$.
APPENDIX 7


Gershgorin's Theorem


This result can be used to give very simple estimates on the location of the
eigenvalues of a matrix, crude or accurate depending on the circumstances.

Gershgorin Circle Theorem. Let $A$ be an $n \times n$ matrix with complex entries.
Decompose it as

    $A = D + F,$    (1)

where $D$ is the diagonal matrix equal to the diagonal of $A$; $F$ has zero diagonal
entries. Denote by $d_i$ the $i$th diagonal entry of $D$, and by $f_i$ the $i$th row of $F$. Define the
circular disc $C_i$ to consist of all complex numbers $z$ satisfying

    $|z - d_i| \le |f_i|_1, \qquad i = 1, \dots, n.$    (2)

The 1-norm of a vector $f$ is the sum of the absolute values of its components; see
Chapter 14. Claim: every eigenvalue of $A$ is contained in one of the discs $C_i$.

Proof. Let $u$ be an eigenvector of $A$,

    $Au = \lambda u,$    (3)

normalized as $|u|_\infty = 1$, where the $\infty$-norm is the maximum of the absolute value of
the components $u_j$ of $u$. Clearly, $|u_j| \le 1$ for all $j$ and $u_i = 1$ for some $i$. Writing
$A = D + F$ in (3), the $i$th component can be written as $d_i + f_i u = \lambda$, which can be
rewritten as

    $\lambda - d_i = f_i u.$
The absolute value of the product $f_i u$ is $\le |f_i|_1 |u|_\infty$, so

    $|\lambda - d_i| \le |f_i|_1 |u|_\infty = |f_i|_1.$ □
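The theorem is simple to try out on small matrices. The sketch below computes the disc centers $d_i$ and radii $|f_i|_1$ for a $2 \times 2$ matrix and checks that each eigenvalue, obtained from the quadratic formula, lies in some disc; the sample matrices and function names are illustrative assumptions.

```python
import cmath


def gershgorin_discs(A):
    """Return (center d_i, radius |f_i|_1) for each row i of A."""
    return [(row[i], sum(abs(t) for j, t in enumerate(row) if j != i))
            for i, row in enumerate(A)]


def eigenvalues_2x2(A):
    """Eigenvalues of a 2 x 2 matrix from its trace and determinant."""
    (a, b), (c, d) = A
    trace, determinant = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return [(trace + root) / 2, (trace - root) / 2]


def in_some_disc(z, discs, tol=1e-12):
    """True if z lies in at least one disc, up to a small rounding tolerance."""
    return any(abs(z - center) <= radius + tol for center, radius in discs)


A = [[4.0, 1.0], [0.5, 1.0]]          # sample matrix with real eigenvalues
R = [[0.0, -1.0], [1.0, 0.0]]         # rotation: eigenvalues +i and -i

for matrix in (A, R):
    discs = gershgorin_discs(matrix)
    print(discs, [in_some_disc(z, discs) for z in eigenvalues_2x2(matrix)])
```

For the rotation matrix both discs coincide with the unit disc and the eigenvalues sit on its boundary, which shows that the estimate cannot be sharpened in general.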

Exercise. Show that if $C_i$ is disjoint from all the other Gershgorin discs, then $C_i$
contains exactly one eigenvalue of $A$.

In many iterative methods for finding the eigenvalues of a matrix $A$, $A$ is
transformed by a sequence of similarity transformations into $A_k$, so that $A_k$ tends to a
diagonal matrix. Being similar to $A$, each $A_k$ has the same eigenvalues as $A$.
Gershgorin's theorem can be used to estimate how closely the diagonal elements of
$A_k$ approximate the eigenvalues of $A$.






APPENDIX 8 


The Multiplicity of Eigenvalues 


The set of $n \times n$ real, self-adjoint matrices forms a linear space of dimension
$N = n(n + 1)/2$. We have seen at the end of Chapter 9 that the set of degenerate
matrices, that is, ones with multiple eigenvalues, form a surface of codimension 2,
that is, of dimension $N - 2$. This explains the phenomenon of "avoided crossing,"
that is, in general, self-adjoint matrices in a one-parameter family have all distinct
eigenvalues. By the same token a two-parameter family of self-adjoint matrices
ought to have a good chance of containing a matrix with a multiple eigenvalue. In
this appendix we state and prove such a theorem about two-parameter families of the
following form:

    $aA + bB + cC, \qquad a^2 + b^2 + c^2 = 1.$    (1)

Here $A$, $B$, $C$ are real, self-adjoint $n \times n$ matrices, and $a$, $b$, $c$ are real numbers.

Theorem (Lax). If $n \equiv 2 \pmod 4$, then there exist $a$, $b$, $c$ such that (1) is
degenerate, that is, has a multiple eigenvalue.

Proof. Denote by $\mathcal{N}$ the set of all nondegenerate matrices. For any $N$ in $\mathcal{N}$ denote
by $k_1 < k_2 < \dots < k_n$ the eigenvalues of $N$ arranged in increasing order and by $u_j$
the corresponding normalized eigenvectors:

    $N u_j = k_j u_j, \qquad \|u_j\| = 1, \qquad j = 1, \dots, n.$    (2)

Note that each $u_j$ is determined only up to a factor $\pm 1$.

Let $N(t)$, $0 \le t \le 2\pi$, be a closed curve in $\mathcal{N}$. If we fix $u_j(0)$, then the normalized
eigenvectors $u_j(t)$ can be determined uniquely as continuous functions of $t$. Since for
a closed curve $N(2\pi) = N(0)$,

    $u_j(2\pi) = \tau_j u_j(0), \qquad \tau_j = \pm 1.$    (3)
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APPENDIX 8 ： THR MULTIPLICITY OF RIGFNVAUiFS 


The quantities $\tau_j$, $j = 1, \ldots, n$, are functionals of the curve $N(t)$. Clearly:

(i) Each $\tau_j$ is invariant under homotopy, that is, continuous deformation in $\mathcal{N}$.

(ii) For a constant curve, that is, $N(t)$ independent of t, each $\tau_j = 1$.

Suppose the theorem is false. Then

$$N(t) = \cos t \, A + \sin t \, B, \qquad 0 \le t \le 2\pi \tag{4}$$

is a closed curve in $\mathcal{N}$. Note that N is periodic, and

$$N(t + \pi) = -N(t).$$

It follows that

$$k_j(t + \pi) = -k_{n-j+1}(t)$$

and that

$$u_j(t + \pi) = \rho_j u_{n-j+1}(t), \tag{5}$$

where $\rho_j = \pm 1$. Since $u_j$ is a continuous function of t, so is $\rho_j$; but since $\rho_j$ can only
take on discrete values, it is independent of t.
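The eigenvalue relation that follows from N(t + π) = −N(t) is easy to confirm numerically. This is a minimal sketch assuming NumPy; the matrices A and B are random symmetric matrices chosen only for illustration. Because `numpy.linalg.eigvalsh` returns eigenvalues in increasing order, the eigenvalues at t + π should be the negatives of those at t, read in reverse order.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

def random_symmetric(n):
    X = rng.standard_normal((n, n))
    return (X + X.T) / 2.0

A = random_symmetric(n)
B = random_symmetric(n)

def N(t):
    """The closed curve N(t) = cos(t) A + sin(t) B."""
    return np.cos(t) * A + np.sin(t) * B

t = 0.7
k = np.linalg.eigvalsh(N(t))                # k_1 <= ... <= k_n at time t
k_shifted = np.linalg.eigvalsh(N(t + np.pi))  # eigenvalues at time t + pi

# k_j(t + pi) = -k_{n-j+1}(t): negate and reverse the order.
expected = -k[::-1]
```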

For each value of t, the eigenvectors $u_1(t), \ldots, u_n(t)$ form an ordered basis. Since
they change continuously they retain their orientation. Thus the two ordered bases

$$u_1(0), \ldots, u_n(0) \quad \text{and} \quad u_1(\pi), \ldots, u_n(\pi) \tag{6}$$

have the same orientation. By (5),

$$u_1(\pi), \ldots, u_n(\pi) = \rho_1 u_n(0), \ldots, \rho_n u_1(0). \tag{6'}$$

Reversing the order of a basis for n even is the same as n/2 transpositions. Since
each transposition reverses orientation, for $n \equiv 2 \pmod 4$ we have an odd number of
transpositions. So in order for (6) and (6)′ to have the same orientation,

$$\prod_{j=1}^{n} \rho_j = -1.$$

Writing this product as

$$\prod_{j=1}^{n/2} \rho_j \rho_{n-j+1} = -1,$$

we conclude there is an index k for which

$$\rho_k \rho_{n-k+1} = -1. \tag{7}$$



and from the last equation that $\lambda^n = 1$. This shows that the eigenvalues $\lambda$ are the n
roots of unity $\lambda_j = \exp\left(\frac{2\pi i}{n} j\right)$, and the corresponding eigenvectors are

$$e_j = \left(\lambda_j, \lambda_j^2, \ldots, \lambda_j^n\right). \tag{1}$$

Each eigenvector has norm $\|e_j\| = \sqrt{n}$.

Every vector $u = (u_1, \ldots, u_n)$ can be expressed as a linear combination of
eigenvectors:

$$u = \sum_{j=1}^{n} a_j e_j. \tag{2}$$

Using the explicit expression (1) of the eigenvectors, we can rewrite (2) as

$$u_k = \sum_{j=1}^{n} a_j \exp\left(\frac{2\pi i}{n} jk\right). \tag{2'}$$

Using the orthogonality of the eigenvectors, along with their norm $\|e_j\| = \sqrt{n}$, we
can express the coefficients $a_j$ as

$$a_j = (u, e_j)/n = \frac{1}{n} \sum_{k=1}^{n} u_k \exp\left(-\frac{2\pi i}{n} jk\right). \tag{3}$$

It is instructive to compare formulas (2)′ and (3) to the expansion of periodic
functions in terms of their Fourier series. But first we have to rewrite equation (2)′ as
follows. Suppose n is odd; we rewrite the last (n − 1)/2 terms in the sum (2)′ by
introducing a new index of summation l related to j by j = n − l. Then

$$\exp\left(\frac{2\pi i}{n}(n - l)k\right) = \exp\left(-\frac{2\pi i}{n} lk\right).$$

Setting this into (2)′, we can rewrite it as

$$u_k = \sum_{j=-(n-1)/2}^{(n-1)/2} a_j \exp\left(\frac{2\pi i}{n} jk\right), \tag{4}$$

where we have reverted to denote the index of summation by j and where $a_j$ is
defined by (3). A similar formula holds when n is even.
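The expansion and its coefficient formula can be verified directly. The sketch below assumes NumPy and uses indices j, k = 1, ..., n as in the text; the sample vector `u` and the helper name are made up for illustration. It computes the coefficients from (3), reconstructs u from the full sum over j = 1, ..., n, and then again from the centered sum (4), where coefficients with nonpositive index are read periodically (index j and index j + n give the same coefficient).

```python
import numpy as np

def finite_fourier_coefficients(u):
    """Coefficients a_j, j = 1..n, of u in the basis of shift eigenvectors."""
    n = len(u)
    idx = np.arange(1, n + 1)
    # E[k-1, j-1] = exp(2*pi*i*j*k/n): the k-th component of the j-th eigenvector.
    E = np.exp(2j * np.pi * np.outer(idx, idx) / n)
    a = E.conj().T @ u / n   # a_j = (1/n) * sum_k u_k * exp(-2*pi*i*j*k/n)
    return a, E

u = np.array([1.0, -2.0, 0.5, 3.0, 4.0])   # n = 5 (odd), arbitrary sample data
n = len(u)
a, E = finite_fourier_coefficients(u)

# Full sum over j = 1..n.
u_back = E @ a

# Centered sum over j = -(n-1)/2 .. (n-1)/2, with a_j periodic in j.
half = (n - 1) // 2
k = np.arange(1, n + 1)
u_centered = sum(a[(m - 1) % n] * np.exp(2j * np.pi * m * k / n)
                 for m in range(-half, half + 1))
```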

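Let u(x) be a periodic function of period 1. Its Fourier series representation is

$$u(x) = \sum_{j=-\infty}^{\infty} b_j \exp(2\pi i j x), \tag{5}$$

where the Fourier coefficients are given by the formula

$$b_j = \int_0^1 u(x) \exp(-2\pi i j x)\, dx. \tag{6}$$

Setting x = k/n into (5) gives

$$u\left(\frac{k}{n}\right) = \sum_{j=-\infty}^{\infty} b_j \exp\left(\frac{2\pi i}{n} jk\right).$$

The integral (6) can be approximated by a finite sum at the equidistant points
$x_k = k/n$:

$$b_j \simeq \frac{1}{n} \sum_{k=1}^{n} u\left(\frac{k}{n}\right) \exp\left(-\frac{2\pi i}{n} jk\right). \tag{6'}$$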
Suppose u is a smooth periodic function, say d times differentiable. Then the
(n − 1)/2 section of the Fourier series (5) is an excellent approximation to u(x):

$$u(x) = \sum_{j=-(n-1)/2}^{(n-1)/2} b_j \exp(2\pi i j x) + O(n^{-d}). \tag{7}$$

Similarly, the approximating sum on the right-hand side of (6)′ differs by $O(n^{-d})$
from $b_j$. It follows that if in (3) we take $u_k = u(k/n)$, then $a_j$ differs from $b_j$ by $O(n^{-d})$.

When u is a smooth periodic function, its derivatives may be calculated by
differentiating its Fourier series term by term:

$$\partial_x^m u(x) = \sum_j b_j (2\pi i j)^m \exp(2\pi i j x).$$

The truncated series is an excellent approximation to $\partial_x^m u$. It follows therefore that

$$\sum_{j=-(n-1)/2}^{(n-1)/2} a_j (2\pi i j)^m \exp\left(\frac{2\pi i}{n} jk\right)$$

is an excellent approximation to $\partial_x^m u(k/n)$, provided that $u_k$ in (3) is taken as $u(k/n)$.
Therein lies the utility of the finite Fourier transform: It can be used to obtain highly
accurate approximations to derivatives of smooth periodic functions, which can then
be used to construct very accurate approximate solutions of differential equations.
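A small experiment shows how accurate this differentiation procedure is. The sketch below assumes NumPy and uses its `numpy.fft` routines in place of the explicit sums; note that NumPy samples at k = 0, ..., n − 1 rather than k = 1, ..., n, which only relabels the periodic grid. The test function exp(sin 2πx) and the helper name are assumptions chosen for illustration.

```python
import numpy as np

def spectral_derivative(samples, order=1):
    """Differentiate a period-1 function from n equidistant samples."""
    n = len(samples)
    # Integer frequencies arranged as 0, 1, ..., (n-1)/2, -(n-1)/2, ..., -1.
    freq = np.fft.fftfreq(n, d=1.0 / n)
    coeffs = np.fft.fft(samples) / n              # finite Fourier coefficients
    scaled = coeffs * (2j * np.pi * freq) ** order
    return np.real(np.fft.ifft(scaled) * n)       # back to values at k/n

n = 33                                            # odd, as in the text
x = np.arange(n) / n
u = np.exp(np.sin(2 * np.pi * x))
exact = 2 * np.pi * np.cos(2 * np.pi * x) * u     # u'(x) by the chain rule

approx = spectral_derivative(u, order=1)
max_error = np.max(np.abs(approx - exact))
```

With only 33 sample points the error is far below what a finite-difference formula on the same grid would give, because the Fourier coefficients of a smooth periodic function decay rapidly.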

On the other hand, operations such as multiplication of u by a given smooth 
function can be carried out fast and accurately when u is represented by its values at 
the points k/n. Since in the course of a calculation the operation of differentiation 




There are some additional computational expenses: additions and rearrangements
of vectors. The total amount of work is $5n \log_2 n$ flops (floating point operations).

The inverse operation, expressing $u_k$ in terms of the $a_j$ [see (2)′], is the same,
except that $\omega$ is replaced by $\bar{\omega}$ and there is no division by n.
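For orientation, here is a sketch of one standard way to organize the fast transform: the recursive radix-2 splitting into even-indexed and odd-indexed entries. It is an illustration under the assumption that n is a power of 2, not necessarily the arrangement described in the text, and it follows NumPy's convention (indices 0, ..., n − 1, no division by n) so that it can be compared with `numpy.fft.fft`.

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 transform: X_j = sum_k x_k exp(-2*pi*i*j*k/n)."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if n == 1:
        return x
    if n % 2:
        raise ValueError("length must be a power of 2")
    even = fft_radix2(x[0::2])   # transform of the even-indexed entries
    odd = fft_radix2(x[1::2])    # transform of the odd-indexed entries
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    # Each level costs O(n) operations; there are log2(n) levels.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

rng = np.random.default_rng(2)
x = rng.standard_normal(16)
X = fft_radix2(x)
```

Replacing the twiddle factors by their complex conjugates gives the inverse transform up to the factor n, in line with the remark above.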

There is an interesting discussion of the history of the Fast Fourier Transform in
the 1968 Arden House Workshop.

When Cooley and Tukey's paper on the Fast Fourier Transform appeared,
Mathematical Reviews reviewed it by title only; the editors did not grasp its
importance.



APPENDIX 10


The Spectral Radius 


Let X be a finite-dimensional Euclidean space, and let A be a linear mapping of X
into X. Denote by r(A) the spectral radius of A:

$$r(A) = \max_i |a_i|, \tag{1}$$

where $a_i$ ranges over all eigenvalues of A. We claim that

$$\lim_{j \to \infty} \|A^j\|^{1/j} = r(A), \tag{2}$$

where $\|A^j\|$ denotes the norm of the jth power of A.
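A quick numerical experiment makes the claim plausible before the proof. The sketch below assumes NumPy; the matrix is a made-up, strongly non-normal example whose norm (about 10) is much larger than its spectral radius (0.5), so the convergence of the jth roots is visibly slow but steady.

```python
import numpy as np

# Upper triangular, so the eigenvalues are the diagonal entries 0.5 and 0.4.
A = np.array([[0.5, 10.0],
              [0.0, 0.4]])

r = max(abs(np.linalg.eigvals(A)))   # spectral radius r(A)

def root_norm(A, j):
    """The quantity ||A^j||^(1/j) in the Euclidean operator norm."""
    return np.linalg.norm(np.linalg.matrix_power(A, j), 2) ** (1.0 / j)

values = {j: root_norm(A, j) for j in (1, 10, 100, 400)}
```

The values decrease toward r(A) from above, consistent with both the lower bound and the limit asserted in (2).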

Proof. A straightforward estimate [see inequality (48) of Chapter 7] shows that

$$\|A^j\|^{1/j} \ge r(A). \tag{3}$$

We shall show that

$$\limsup_{j \to \infty} \|A^j\|^{1/j} \le r(A). \tag{4}$$

Combining (3) and (4) gives (2).

We can introduce an orthonormal basis in X, thereby turning X into $\mathbb{C}^n$, with the
standard Euclidean norm $(|x_1|^2 + \cdots + |x_n|^2)^{1/2}$, and A into an $n \times n$ matrix with
complex entries. We start with the Schur factorization of A:

Theorem 1. Every square matrix A with complex entries can be factored as

$$A = QTQ^*, \tag{5}$$

where Q is unitary and T is upper triangular:

$$t_{ij} = 0 \quad \text{for } i > j.$$

(5) is called a Schur factorization of A.
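The factorization can be computed by induction on the order of the matrix: pick one unit eigenvector, complete it to an orthonormal basis, and recurse on the lower-right block. The sketch below is a hedged illustration of that idea using NumPy; the function name is hypothetical, the completion to an orthonormal basis is done with a QR factorization (an implementation choice, not taken from the text), and production code would normally call a library routine instead.

```python
import numpy as np

def schur_by_induction(A):
    """Return (Q, T) with Q unitary, T upper triangular, A ~ Q T Q*."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex), A.copy()
    # One eigenvector q of A, normalized.
    _, vecs = np.linalg.eig(A)
    q = vecs[:, 0] / np.linalg.norm(vecs[:, 0])
    # Complete q to an orthonormal basis: QR of [q | I] gives a unitary U
    # whose first column is a unimodular multiple of q.
    U, _ = np.linalg.qr(np.column_stack([q, np.eye(n)]))
    B = U.conj().T @ A @ U          # first column is (a, 0, ..., 0)'
    Q1, _ = schur_by_induction(B[1:, 1:])
    Q = U.copy()
    Q[:, 1:] = U[:, 1:] @ Q1        # Q = U * diag(1, Q1)
    T = Q.conj().T @ A @ Q
    return Q, np.triu(T)            # drop rounding noise below the diagonal

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
Q, T = schur_by_induction(A)
```

The diagonal of T carries the eigenvalues of A, which is the property the spectral radius argument relies on.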

Proof. If A is a normal matrix, then according to Theorem 8 of Chapter 8 it has a
complete set of orthogonal eigenvectors $q_1, \ldots, q_n$, with eigenvalues $a_1, \ldots, a_n$:

$$A q_k = a_k q_k. \tag{6}$$

Choose the $q_k$ to have norm 1, and define the matrix Q as

$$Q = (q_1, \ldots, q_n).$$

Since the columns of Q are pairwise orthogonal unit vectors, Q is unitary. Equations
(6) can be expressed in terms of Q as

$$AQ = QD, \tag{6'}$$

where D is the diagonal matrix with $D_{kk} = a_k$. Multiplying (6)′ by $Q^*$ on the right
gives a Schur factorization of A, with T = D.

For arbitrary A we argue inductively on the order n of A.

We have shown at the beginning of Chapter 6 that every $n \times n$ matrix A has at
least one eigenvalue a, possibly complex:

$$Aq = aq. \tag{7}$$

Choose q to have norm 1, and complete it to an orthonormal basis $q_1, \ldots, q_n$, $q_1 = q$,
and define the matrix U to have columns $q_1, \ldots, q_n$:

$$U = (q_1, \ldots, q_n).$$

Clearly, U is a unitary matrix, and the first column of AU is aq:

$$AU = (aq, c_2, \ldots, c_n). \tag{7'}$$

The adjoint $U^*$ of U has the form



where c is some positive number. Using the triangle inequality, we conclude that

$$\|T_\epsilon\| = \|D + S_\epsilon\| \le r(A) + c\epsilon. \tag{15}$$

It follows from (14) that

$$T = D_\epsilon T_\epsilon D_\epsilon^{-1}.$$

Set this in the Schur factorization (5) of A:

$$A = Q D_\epsilon T_\epsilon D_\epsilon^{-1} Q^*. \tag{16}$$

Denote $Q D_\epsilon$ by M. Since Q is unitary, $Q^* = Q^{-1}$, and (16) can be rewritten as

$$A = M T_\epsilon M^{-1}. \tag{16'}$$

It follows that

$$A^j = M T_\epsilon^j M^{-1}.$$

Using the multiplicative inequality for the matrix norm, we obtain the inequality

$$\|A^j\| \le \|M\| \, \|M^{-1}\| \, \|T_\epsilon\|^j.$$

Taking the jth root, we get

$$\|A^j\|^{1/j} \le m^{1/j} \|T_\epsilon\|,$$

where $m = \|M\| \, \|M^{-1}\|$. Using the estimate (15) on the right gives

$$\|A^j\|^{1/j} \le m^{1/j} (r(A) + c\epsilon).$$

Now let j tend to $\infty$; we get that

$$\limsup_{j \to \infty} \|A^j\|^{1/j} \le r(A) + c\epsilon.$$

Since this holds for all positive $\epsilon$, we have

$$\limsup_{j \to \infty} \|A^j\|^{1/j} \le r(A),$$

as asserted in (4). As noted there, (4) and (3) imply (2). □
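The scaling step behind estimate (15) can be illustrated numerically. The definition of the diagonal scaling matrix is not reproduced in this excerpt, so the sketch below assumes one common convention: a diagonal matrix with entries ε, ε², ..., εⁿ, used to conjugate the triangular matrix so that the entry in position (i, j) is multiplied by ε^(j−i). The matrix `T` and the helper name are made up; the bound tested uses the Frobenius norm of the strictly upper triangular part as one admissible constant c.

```python
import numpy as np

# A made-up upper triangular matrix with large off-diagonal entries.
T = np.array([[0.5, 3.0, -2.0],
              [0.0, -0.4, 5.0],
              [0.0, 0.0, 0.3]])

r = np.max(np.abs(np.diag(T)))        # spectral radius: largest |diagonal entry|
S = np.triu(T, k=1)                   # strictly upper triangular part
c = np.linalg.norm(S, "fro")          # one admissible constant for 0 < eps <= 1

def scaled(T, eps):
    """Conjugate T by diag(eps, eps^2, ...): entry (i, j) becomes t_ij * eps^(j-i)."""
    n = T.shape[0]
    d = eps ** np.arange(1, n + 1)
    return (T * d[np.newaxis, :]) / d[:, np.newaxis]

norms = {eps: np.linalg.norm(scaled(T, eps), 2) for eps in (0.5, 0.1, 0.01)}
```

As ε shrinks, the norm of the scaled matrix approaches the spectral radius, while its diagonal (and hence its eigenvalues) stays fixed.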

Exercise 2. Show that the Euclidean norm of a diagonal matrix is the
maximum of the absolute values of its eigenvalues.
Note 1. The crucial step in the argument is to show that every matrix is similar to
an upper triangular matrix. We could have appealed to the result in Chapter 6 that
every matrix is similar to a matrix in Jordan form, since matrices in Jordan form are
upper triangular. Since the proof of the Jordan form is delicate, we preferred to base
the argument on Schur factorization, whose proof is more robust.


Exercise 3. Prove the analogue of relation (2),

$$\lim_{j \to \infty} |A^j|^{1/j} = r(A), \tag{17}$$

when A is a linear mapping of any finite-dimensional normed linear space X (see
Chapters 14 and 15).


Note 2. Relation (17) holds for mappings in infinite-dimensional spaces as well. 
The proof given above relies heavily on the spectral theory of linear mappings in 
finite-dimensional spaces, which has no infinite-dimensional analogue. We shall 
therefore sketch another approach to relation (17) that has a straightforward 
extension to infinite-dimensional normed linear spaces. This approach is based on 
the notion of matrix-valued analytic functions. 


Definition 1. Let z = x + iy be a complex variable, A(z) an $n \times n$ matrix-valued
function of z. A(z) is an analytic function of z in a domain G of the z plane if
all entries $a_{ij}(z)$ of A(z) are analytic functions of z in G.


Definition 2. X is a finite-dimensional normed linear space, and A(z) is a family
of linear mappings of X into X, depending on the complex parameter z. A(z) depends
analytically on z in a domain G if the limit

$$\lim_{h \to 0} \frac{A(z + h) - A(z)}{h} = A'(z)$$

exists in the sense of convergence defined in equation (16) of Chapter 15.


Exercise 4. Show that the two definitions are equivalent.

Exercise 5. Let A(z) be an analytic matrix function in a domain G, invertible at
every point of G. Show that then $A^{-1}(z)$, too, is an analytic matrix function in G.

Exercise 6. Show that the Cauchy integral theorem holds for matrix-valued
functions.

The analytic functions we shall be dealing with are resolvents. The resolvent of A
is defined as

$$R(z) = (zI - A)^{-1} \tag{18}$$

for all z not an eigenvalue of A. It follows from Exercise 5 that R(z) is an analytic
function.

Theorem 2. For |z| > |A|, R(z) has the expansion

$$R(z) = \sum_{n=0}^{\infty} \frac{A^n}{z^{n+1}}. \tag{19}$$

Proof. By the multiplicative estimate $|A^n| \le |A|^n$ it follows that the series on the
right-hand side of (19) converges for |z| > |A|.

Multiply (19) by (zI − A); term-by-term multiplication gives I on the right-hand
side. This proves that (19) is the inverse of (zI − A). □
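The expansion can be compared with a directly computed inverse. This is a minimal sketch assuming NumPy; the matrix A and the point z are arbitrary choices with |z| larger than the norm of A, and the number of terms kept is an assumption large enough for the remainder to be negligible.

```python
import numpy as np

A = np.array([[0.2, 0.5],
              [0.1, 0.3]])
z = 2.0                                   # |z| exceeds the norm of A

def resolvent_series(A, z, terms):
    """Partial sum of the series: sum over m < terms of A^m / z^(m+1)."""
    n = A.shape[0]
    total = np.zeros((n, n))
    power = np.eye(n)                     # A^0
    for m in range(terms):
        total = total + power / z ** (m + 1)
        power = power @ A
    return total

R_series = resolvent_series(A, z, 60)
R_exact = np.linalg.inv(z * np.eye(2) - A)
```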

Multiply (19) by $z^j$ and integrate it over any circle |z| = s > |A|. On the right-
hand side we integrate term by term; only the jth integral is $\ne 0$, so we get

$$\int_{|z| = s} R(z) z^j \, dz = 2\pi i A^j. \tag{20}$$

Since R(z) is an analytic function outside the spectrum of A, we can, according to
Exercise 6, deform the circle of integration to any circle of radius
$s = r(A) + \epsilon$, $\epsilon > 0$:

$$\int_{|z| = r(A) + \epsilon} R(z) z^j \, dz = 2\pi i A^j. \tag{20'}$$
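The contour integral representation can be checked with the trapezoid rule in the polar angle, which is very accurate for periodic analytic integrands. The sketch below assumes NumPy; the matrix and the number of quadrature points are illustrative choices. The radius 0.6 is picked on purpose to lie between the spectral radius of A (about 0.48) and its Euclidean norm (about 0.62), so the circle is one that the series argument alone does not cover.

```python
import numpy as np

A = np.array([[0.2, 0.5],
              [0.1, 0.3]])

def power_by_contour(A, j, s, m=256):
    """Approximate A^j as (1 / (2*pi*i)) * integral of R(z) z^j dz over |z| = s."""
    n = A.shape[0]
    theta = 2 * np.pi * np.arange(m) / m
    total = np.zeros((n, n), dtype=complex)
    for th in theta:
        z = s * np.exp(1j * th)
        R = np.linalg.inv(z * np.eye(n) - A)   # resolvent at z
        # dz = i z d(theta), so R z^j dz / (2 pi i) = R z^(j+1) d(theta) / (2 pi).
        total += R * z ** (j + 1)
    return total / m                            # trapezoid rule with step 2*pi/m

spectral_radius = max(abs(np.linalg.eigvals(A)))
approx = power_by_contour(A, 3, 0.6)
exact = np.linalg.matrix_power(A, 3)
```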

To estimate the norm of $A^j$ from its integral representation (20)′, we rewrite the dz
integration in terms of $d\theta$ integration, where $\theta$ is the polar angle, $z = se^{i\theta}$ and
$dz = sie^{i\theta} d\theta$:

$$A^j = \frac{1}{2\pi} \int_0^{2\pi} R(se^{i\theta}) s^{j+1} e^{i\theta(j+1)} \, d\theta. \tag{21}$$

The norm of an integral of linear maps is bounded by the maximum of the integrand
times the length of the interval of integration. Since R(z) is an analytic function, it is
continuous on the circle $|z| = r(A) + \epsilon$, $\epsilon > 0$; denote the maximum of |R(z)| on
this circle by $c(\epsilon)$. We can then estimate the norm of $A^j$ from its integral
representation (21), with $s = r(A) + \epsilon$, as follows:

$$|A^j| \le (r(A) + \epsilon)^{j+1} c(\epsilon).$$

Take the jth root:

$$|A^j|^{1/j} \le m(\epsilon)^{1/j} (r(A) + \epsilon), \tag{22}$$

where $m(\epsilon) = (r(A) + \epsilon) c(\epsilon)$. Let j tend to $\infty$ in (22); we get

$$\limsup_{j \to \infty} |A^j|^{1/j} \le r(A) + \epsilon.$$

Since this holds for all positive $\epsilon$, no matter how small,

$$\limsup_{j \to \infty} |A^j|^{1/j} \le r(A).$$

On the other hand, analogously to (3),

$$|A^j| \ge r(A)^j$$

for any norm. Taking the jth root gives

$$|A^j|^{1/j} \ge r(A).$$

Combining this with the preceding lim sup inequality, we deduce (17). □

This proof nowhere uses the finite dimensionality of the normed linear space on
which A acts.
APPENDIX 11


The Lorentz Group


1. In classical mechanics, particles and bodies are located in absolute, motionless
space equipped with Euclidean structure. Motion of particles is described by giving
their position in absolute space as a function of an absolute time.

In the relativistic description, there is no absolute space and time, because space
and time are inseparable. The speed of light is the same in two coordinate systems
moving with constant velocity with respect to each other. This can be expressed by
saying that the Minkowski metric $t^2 - x^2 - y^2 - z^2$ is the same in both coordinate
systems; here we have taken the speed of light to have the numerical value 1.
A linear transformation of four-dimensional space-time that preserves the
quadratic form $t^2 - x^2 - y^2 - z^2$ is called a Lorentz transformation. In this chapter
we shall investigate their properties.

We start with the slightly simpler (2 + 1)-dimensional space-time. Denote by u
the space-time vector $(t, x, y)'$, and denote by M the matrix

$$M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{1}$$

Clearly,

$$t^2 - x^2 - y^2 = (u, Mu), \tag{2}$$

where ( , ) denotes the standard scalar product in $\mathbb{R}^3$.

The condition that the Lorentz transformation L preserve the quadratic form (2) is
that for all u,

$$(Lu, MLu) = (u, Mu). \tag{3}$$
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rf+l — 

m=[u.= e 

I -t /Cpp£/ 

1^9 


Lh 

卜 n( 

■ef 

If / = {I.J + !). wr haw ^ = A. If / = { 1 ,.-, e /+ 1 } \ {i}, w havr 1 

P := f,). All ulhn polylvctlra P{ ruhtuin .slrai^ht linfs. [n purtirtibtr if 

i,j^ /, tlwu Pjr a strnighi lina in th^ fliiwtioii of c, - e. r □ 


PROBLEMS, 

■ °. Lr-l coiic(P L F) hf ihe stupporl conf of F at a face F C Pi JStf Prank'lli 3 t>f 
Socliuii 4,1. Let us- fix a 0 < t < d. Pruvc that fur t\iv sihihi^rd siinp]cx A 

]A}- E (-l) dira ^[^io(A.n] 

F h ： ft fer ■” if i 
If L maps $(1, 0, 0)'$ into the blc, we argue analogously; this completes the proof of
Theorem 1. □

Definition. It follows from (4)′ that det L = ±1. The Lorentz transformations
that map the flc onto itself, and for which det L = 1, form the proper Lorentz group.

Theorem 2. Suppose L belongs to the proper Lorentz group, and maps the point
$e = (1, 0, 0)'$ onto itself. Then L is rotation around the t axis.

Proof. Le = e implies that the first column of L is $(1, 0, 0)'$. According to (4),
$L'ML = M$; since Me = e, $L'e = e$; therefore the first column of $L'$ is $(1, 0, 0)'$. So
the first row of L is (1, 0, 0). Thus L has the form

$$L = \begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix},$$

where R is a 2 × 2 block. Since L preserves the Minkowski metric, R is an isometry. Since det L = 1,
det R = 1; so R is a rotation. □




Exercise 2. Show that Lorentz transformations preserve solutions of the wave
equation. That is, if f(t, x, y) satisfies

$$f_{tt} - f_{xx} - f_{yy} = 0,$$

then f(L(t, x, y)) satisfies the same equation.

Next we shall present an explicit description of proper Lorentz transformations.
Given any point $u = (t, x, y)'$, we represent it by the 2 × 2 symmetric matrix U:

$$U = \begin{pmatrix} t - x & y \\ y & t + x \end{pmatrix}. \tag{5}$$

Clearly, U is real and symmetric and

$$\det U = t^2 - x^2 - y^2, \qquad \operatorname{tr} U = 2t. \tag{6}$$

Let W be any 2 × 2 real matrix whose determinant equals 1. Define the 2 × 2 matrix
V by

$$WUW' = V. \tag{7}$$

Clearly, V is real and symmetric and

$$\det V = (\det W)(\det U)(\det W') = \det U, \tag{8}$$
since we have assumed that det W = 1. Denote the entries of V as

$$V = \begin{pmatrix} t' - x' & y' \\ y' & t' + x' \end{pmatrix}. \tag{9}$$

Given W, (5), (7), and (9) define a linear transformation $(t, x, y) \to (t', x', y')$. It
follows from (6) and (8) that $t^2 - x^2 - y^2 = t'^2 - x'^2 - y'^2$. That is, each W
generates a Lorentz transformation. We denote it by $L_W$. Clearly, W and −W
generate the same Lorentz transformation. Conversely,

Exercise 3. Show that if W and Z generate the same Lorentz transformation,
then Z = W or Z = −W.

The 2 × 2 matrices W with real entries and determinant 1 form a group under
matrix multiplication, called the special linear group of order 2 over the reals. This
group is denoted as SL(2, $\mathbb{R}$).

Exercise 4. Show that SL(2, $\mathbb{R}$) is connected; that is, that every W in SL(2, $\mathbb{R}$)
can be deformed continuously within SL(2, $\mathbb{R}$) into I.


Formulas (5), (7), and (9) define a two-to-one mapping of SL(2, $\mathbb{R}$) into the
(2 + 1)-dimensional Lorentz group. This mapping is a homomorphism, that is,

$$L_{WZ} = L_W L_Z. \tag{10}$$
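The construction can be carried out concretely. The sketch below assumes NumPy and assumes that a point (t, x, y) is represented by the symmetric matrix with diagonal entries t − x and t + x and off-diagonal entry y (this matches the determinant and trace identities stated in the text); all function names are hypothetical. It builds the 3 × 3 matrix generated by a given W column by column and checks that it preserves the Minkowski form, has determinant 1, and respects products.

```python
import numpy as np

M = np.diag([1.0, -1.0, -1.0])

def to_matrix(u):
    """Symmetric 2x2 matrix representing the point u = (t, x, y)."""
    t, x, y = u
    return np.array([[t - x, y], [y, t + x]])

def from_matrix(V):
    """Recover (t, x, y) from the symmetric matrix V."""
    t = (V[0, 0] + V[1, 1]) / 2.0
    x = (V[1, 1] - V[0, 0]) / 2.0
    y = V[0, 1]
    return np.array([t, x, y])

def lorentz_from_sl2(W):
    """3x3 matrix of the map u -> point represented by W U W'."""
    cols = [from_matrix(W @ to_matrix(e) @ W.T) for e in np.eye(3)]
    return np.column_stack(cols)

W = np.array([[2.0, 1.0], [1.0, 1.0]])           # determinant 1
theta = 0.3
Z = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])   # rotation, also determinant 1

L_W = lorentz_from_sl2(W)
L_Z = lorentz_from_sl2(Z)
L_WZ = lorentz_from_sl2(W @ Z)
```

For the rotation Z the resulting matrix fixes the t axis and rotates the (x, y) plane through twice the angle, which can be read off from the trace 2 cos 2θ of its lower-right 2 × 2 block.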


Exercise 5. Verify (10).

Theorem 3. (a) For W in SL(2, $\mathbb{R}$), $L_W$ belongs to the proper Lorentz group.

(b) Given any two points u and v in the flc, satisfying (u, Mu) = (v, Mv), there is
a Y in SL(2, $\mathbb{R}$) such that $L_Y u = v$.

(c) If Z is a rotation, $L_Z$ is a rotation around the t axis.

Proof. (a) A symmetric matrix U representing a point $u = (t, x, y)'$ in the flc is
positive, and the converse is also true. For according to (6), $\det U = t^2 - x^2 - y^2$,
tr U = 2t, and the positivity of both is equivalent to the positivity of the symmetric
matrix U.

By definitions (7) and (9), the matrix V representing $v = L_W u$ is $V = WUW'$;
clearly, if U is a positive symmetric matrix, so is V. This shows that $L_W$ maps the flc
into the flc.

According to (4)′, the determinant of the Lorentz transformation $L_W$ is 1 or −1.
When W is the identity I, $L_W$ is the identity, and so $\det L_I = 1$. During a continuous
deformation of W, $\det L_W$ changes continuously, so it doesn't change at all.
Therefore, $\det L_W = 1$ for all W that can be deformed continuously into I. According
to Exercise 4, all W can be deformed into I; this shows that for all W, $L_W$ is a proper
Lorentz transformation.

(b) We shall show that given any v in the flc, there is a W in SL(2, $\mathbb{R}$) such that
$L_W$ maps $te = t(1, 0, 0)'$ into v, where t is the positive square root of (v, Mv). The
matrix representing e is I; the matrix representing $L_W te$ is $W tI W' = tWW'$. So we
have to choose W so that $tWW' = V$, where V represents v. Since $t^2 = (v, Mv) =
\det V$ and since by (a) V is positive, we can satisfy the equation for W by setting W as
the positive square root of $t^{-1}V$.

Similarly, for any other point u in the flc for which (u, Mu) = (v, Mv), there is a Z
in SL(2, $\mathbb{R}$) for which $L_Z te = u$. Then $L_W L_Z^{-1}$ maps u into v. Since $W \to L_W$ is a
homomorphism, $L_W L_Z^{-1} = L_{WZ^{-1}}$.

(c) Suppose that Z is a rotation in $\mathbb{R}^2$; then $Z'Z = I$. Using the commutativity of
trace, we get from $V = ZUZ'$ that

$$\operatorname{tr} V = \operatorname{tr} ZUZ' = \operatorname{tr} UZ'Z = \operatorname{tr} U$$

for all U. For U of form (5) and V of form (9), tr U = 2t, tr V = 2t′, so t = t′ for all U.
Since $t^2 - x^2 - y^2 = t'^2 - x'^2 - y'^2$, it follows that $L_Z$ maps (t, 0, 0) into itself. We
appeal to Theorem 2 to conclude that $L_Z$ is rotation around the t axis. □

Exercise 6. Show that if Z is rotation by angle $\theta$, $L_Z$ is rotation by angle $2\theta$.

Theorem 4. Every proper Lorentz transformation L is of the form $L_Y$, Y in
SL(2, $\mathbb{R}$).

Proof. Denote by u the image of $e = (1, 0, 0)'$ under L:

$$Le = u.$$

Since e lies in the flc, so does u. According to part (b) of Theorem 3, $L_W e = u$ for
some W in SL(2, $\mathbb{R}$). Therefore $L_W^{-1} L e = e$; according to Theorem 2, $L_W^{-1} L$ is
rotation around the t axis. By part (c) of Theorem 3, along with Exercise 6, there
is a rotation Z in SL(2, $\mathbb{R}$) such that $L_W^{-1} L = L_Z$; it follows that
$L = L_W L_Z = L_{WZ}$. □

Exercise 7. Show that a symmetric 2 × 2 matrix is positive iff its trace and
determinant are both positive.

Exercise 8. (a) Let L(s) be a one-parameter family of Lorentz transformations
that depends differentiably on s. Show that L(s) satisfies a differential equation of the
form

$$\partial_s L = AML, \tag{11}$$

where A(s) is anti-self-adjoint.
Since (u, p) = 0,

$$st = -(ax + by).$$

Applying the Schwarz inequality on the right gives the opposite of the previous
inequality, a contradiction.

Given two distinct points u and v in $\mathbb{H}$, there is a unique p, except for constant
multiple, that satisfies (u, p) = 0, (v, p) = 0. According to what we have shown
above, p satisfies (13) when u or v belongs to $\mathbb{H}$.

(c) Take the line consisting of all points u in $\mathbb{H}$ that satisfy (u, p) = 0, where p is
a given vector, (p, Mp) < 0. Let L be a proper Lorentz transformation; the inverse
image of the line under L consists of all points v such that u = Lv. These points v
satisfy $(Lv, p) = (v, L'p) = 0$. We claim that the points v lie on $\mathbb{H}$, and that $q = L'p$
satisfies (q, Mq) < 0. Both of these assertions follow from the properties of proper
Lorentz transformations. □

Next we verify that our geometry is non-Euclidean. Take all lines (u, p) = 0
through the point u = (1, 0, 0). Clearly, such a p is of the form p = (0, a, b), and the
points u = (t, x, y) on the line (u, p) = 0 satisfy

$$ax + by = 0. \tag{14}$$

Take q = (1, 1, 1); points u = (t, x, y) on the line (u, q) = 0 satisfy t + x + y = 0.
For such a point u,

$$(u, Mu) = t^2 - x^2 - y^2 = (x + y)^2 - x^2 - y^2 = 2xy. \tag{15}$$

The points u on the intersection of the two lines satisfy both (u, p) = 0
and (u, q) = 0. If a and b are of the same sign, it follows from (14) that x and y
are of the opposite sign; it follows from (15) that such a u does not lie in the
flc.

Thus there are infinitely many lines through the point (1, 0, 0) that do not intersect
the line t + x + y = 0 in $\mathbb{H}$; this violates the parallel postulate of Euclidean
geometry.

In our geometry, the proper Lorentz transformations are the analogues of
Euclidean motions, that is, translations combined with rotations. Both objects form a three-
parameter family.

We turn now to the definition of distance in our geometry. Take two nearby
points in $\mathbb{H}$, denoted as (t, x, y) and (t + dt, x + dx, y + dy). Their image under
a proper Lorentz transformation L is (t′, x′, y′) and (t′ + dt′, x′ + dx′, y′ + dy′).
Since L is linear, (dt′, dx′, dy′) is the image of (dt, dx, dy) under L, and
therefore

$$dt'^2 - dx'^2 - dy'^2 = dt^2 - dx^2 - dy^2,$$

an invariant quadratic form. Since for points of $\mathbb{H}$, $t^2 - x^2 - y^2 = 1$, we have
$t \, dt = x \, dx + y \, dy$. So we choose as invariant metric

$$dx^2 + dy^2 - dt^2 = dx^2 + dy^2 - \frac{(x \, dx + y \, dy)^2}{t^2}. \tag{16}$$

Once we have a metric, we can define the angle between lines at their point of
intersection using the metric (16).

3. In this section we shall briefly outline the theory of Lorentz transformations in
(3 + 1)-dimensional space-time. The details of the results and their proofs are
analogous to those described in Section 1.

The Minkowski metric in 3 + 1 dimensions is $t^2 - x^2 - y^2 - z^2$, and a Lorentz
transformation is a linear map that preserves this metric. The Minkowski metric can
be expressed, analogously with (2), as

$$(u, Mu), \tag{17}$$

where M is the 4 × 4 diagonal matrix whose diagonal entries are 1, −1, −1, and −1,
and u denotes a point $(t, x, y, z)'$ in (3 + 1)-dimensional space-time. The forward
light cone is defined, as before, as the set of points u for which (u, Mu) > 0 and
t > 0.

A Lorentz transformation is represented by a matrix L that satisfies the four-
dimensional analogue of equation (4):

$$L'ML = M. \tag{18}$$

A proper Lorentz transformation is one that maps the flc onto itself and
whose determinant det L equals 1. The proper Lorentz transformations form a
group.

Just as in the (2 + 1)-dimensional case, proper Lorentz transformations in 3 + 1
space can be described explicitly. We start by representing vectors $u = (t, x, y, z)'$ in
3 + 1 space by complex-valued self-adjoint 2 × 2 matrices

$$U = \begin{pmatrix} t - x & y + iz \\ y - iz & t + x \end{pmatrix}. \tag{19}$$

The Minkowski metric of u can be expressed as

$$t^2 - x^2 - y^2 - z^2 = \det U. \tag{20}$$

Let W be any complex-valued 2 × 2 matrix of determinant 1. Define the 2 × 2
matrix V by

$$V = WUW^*, \tag{21}$$
where $W^*$ is the adjoint of W, and U is defined by (19). Clearly, V is self-adjoint, so it
can be written as

$$V = \begin{pmatrix} t' - x' & y' + iz' \\ y' - iz' & t' + x' \end{pmatrix}. \tag{22}$$

Given W, (19), (21), and (22) define a linear map $(t, x, y, z) \to (t', x', y', z')$. Take the
determinant of (21):

$$\det V = (\det W)(\det U)(\det W^*).$$

Using (20), (20)′, and det W = 1, it follows that

$$t^2 - x^2 - y^2 - z^2 = t'^2 - x'^2 - y'^2 - z'^2.$$

This shows that each W generates a Lorentz transformation. We denote it as $L_W$.

The complex-valued 2 × 2 matrices of determinant 1 form a group denoted as
SL(2, $\mathbb{C}$).

Theorem 6. (a) For every W in SL(2, $\mathbb{C}$), $L_W$ defined above is a proper Lorentz
transformation.

(b) The mapping $W \to L_W$ is a homomorphic map of SL(2, $\mathbb{C}$) onto the proper
Lorentz group. This mapping is 2 to 1.

We leave it to the reader to prove this theorem using the techniques developed in
Section 1.

4. In this section we shall establish a relation between the group SU(2, $\mathbb{C}$) of 2 × 2
unitary matrices and SO(3, $\mathbb{R}$), the group of rotations in $\mathbb{R}^3$.

We represent a point $(x, y, z)'$ of $\mathbb{R}^3$ by the 2 × 2 matrix

$$\begin{pmatrix} x & y + iz \\ y - iz & -x \end{pmatrix} = U. \tag{23}$$

Clearly,

$$-\det U = x^2 + y^2 + z^2. \tag{24}$$

The matrices U are 2 × 2 self-adjoint matrices, trace of U = 0.

Theorem 7. Z is a 2 × 2 unitary matrix of determinant 1, and U is as above.
Then

$$V = ZUZ^* \tag{25}$$




APPENDIX 12


Compactness of the Unit Ball


In this Appendix we shall present examples of Euclidean spaces X whose unit ball is
compact; that is, where every sequence $\{x_k\}$ of vectors in X, $\|x_k\| \le 1$, has a
convergent subsequence. According to Theorem 17 of Chapter 7, such spaces are
finite dimensional. Thus compactness of the unit ball is an important criterion for
finite dimensionality.

Let G be a bounded domain in the x, y plane whose boundary is smooth. Let
u(x, y) be a twice differentiable function that satisfies in G the partial differential
equation

$$au + \Delta u = 0, \tag{1}$$

where a is a positive constant and $\Delta$ is the Laplace operator:

$$\Delta u = u_{xx} + u_{yy}; \tag{2}$$

here subscripts x, y denote partial derivatives with respect to these variables.

Denote by S the set of solutions of equation (1) which in addition are zero on the
boundary of G:

$$u(x, y) = 0 \quad \text{for } (x, y) \text{ in } \partial G. \tag{3}$$

Clearly, the set S of such solutions forms a linear space.

We define for u in S the norm

$$\|u\|^2 = \int_G u^2(x, y) \, dx \, dy.$$
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6. Example: Totally UniinoduLar Poly topes 

rn one special cme, Brion's Theorem (CoTollarv 4.^) gives a particularljr $uCrind 
reprcucntation of thr- functiciii. 

(6 + l) DefliiitionH, Lf»l U|,., . , e 2 d Ih? linearly indopoiirfent vectors anrf let 
K 了 co(rxi — , u k } h Lei — span(ti]_.. r ua. ) euk! let Ajt = Z d n . We nay 
that K i« a fitme .., ■ is a basis of A*, considered as a 

tattiix* in W> call u\ . uit grnfraior^ of K. 

Let F C be an jiitespr polvlope. We say rhat P is totally unimodu^r 
provided I lie aupport tone at every vertex of F is a traiud&tion of a euiimtKiular 
coue. 

itttfHirlAMt nn i totflUy unimodulur. 

PROBLEMS, 

i, Lrt hr thf ^tiinriard (d - l).dimciuiunAl simplex in R d ： 
rf 

A — = 1 and £, > 0 for i — l t .., 

Prim thrit A is ji hj-tfJly ij,i;iii]uilii]ar liutyloper 

2. Let us fix positive integers m and n and let us identify R^d, d = mn, with
the space of all m x n real matrices. Then Z^d is identified with the space of all
m x n integer matrices. Let us fix positive integers \alpha_1, ..., \alpha_m and
\beta_1, ..., \beta_n and let P \subset R^d be the polyhedron of all non-negative
m x n matrices with row sums \alpha_1, ..., \alpha_m and column sums
\beta_1, ..., \beta_n. Suppose that dim P = (m - 1)(n - 1) and that P is a simple
polytope, Definition VI.5.1 (which means that \alpha_1, ..., \alpha_m and
\beta_1, ..., \beta_n are chosen in a sufficiently generic way). Prove that P is a
totally unimodular polytope.

Remark: More generally, a sufficiently generic transportation polytope (see
Section 11.7) is totally unimodular. Non-negative integer matrices with prescribed
row and column sums are called contingency tables.

3. Let u_1, ..., u_d \in Z^d be linearly independent lattice points and let
K = co(u_1, ..., u_d). Prove that K can be dissected into the union of unimodular
cones, that is, there is a decomposition

K = \bigcup_{i=1}^{p} K_i

where each cone K_i is unimodular and the intersection K_i \cap K_j of every two
distinct cones K_i and K_j is a proper face of both.

4. Let K \subset R^d be a unimodular cone such that dim K = d. Prove that the
polar K^\circ \subset R^d is a unimodular cone.

For totally unimodular polytopes, Brion's Theorem (Corollary 4.6) gives a
particularly nice identity.
Theorem 3. Let G be a bounded domain with a smooth boundary, and let D be a
set of functions in G whose values and the values of their first derivatives are
uniformly bounded in G by a common bound m. Every sequence of functions in D
contains a subsequence that converges in the maximum norm.

Exercise 1. (i) Show that a set of functions whose first derivatives are
uniformly bounded in G is equicontinuous in G.

(ii) Use (i) and the Arzela-Ascoli theorem to prove Theorem 3.



APPENDIX 13 


A Characterization of Commutators 


We shall prove the following result.


Theorem 1. An n x n matrix X is the commutator of two n x n matrices A
and B,


X = AB - BA, (1)

iff the trace of X is zero.


Proof. We have shown in Theorem 7 of Chapter 5 that trace is commutative,
that is, that

tr AB = tr BA.

It follows that for X of form (1), tr X = 0. We show now the converse. □


Lemma 2. Every matrix X all of whose diagonal entries are zero can be
represented as a commutator.

Proof. We shall construct explicitly a pair of matrices A and B so that (1) holds.
We choose arbitrarily n distinct numbers a_1, ..., a_n and define A to be the diagonal
matrix with diagonal entries a_i:

A_{ij} = \begin{cases} 0 & \text{for } i \ne j, \\ a_i & \text{for } i = j. \end{cases}

We define B as

B_{ij} = \begin{cases} \dfrac{X_{ij}}{a_i - a_j} & \text{for } i \ne j, \\ \text{anything} & \text{for } i = j. \end{cases}
Then for i \ne j

(AB - BA)_{ij} = a_i B_{ij} - B_{ij} a_j = (a_i - a_j) B_{ij} = X_{ij},

while

(AB - BA)_{ii} = a_i B_{ii} - B_{ii} a_i = 0.

This verifies (1). □
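
The construction in the proof of Lemma 2 can be carried out exactly with rational arithmetic. The following is a minimal sketch: the choice a_i = i for the distinct numbers and the choice B_{ii} = 0 for the free diagonal entries are assumptions made only for illustration, as are the function names.

```python
from fractions import Fraction

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    ab, ba = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def commutator_factors(X):
    """Return (A, B) with AB - BA = X for a square X whose diagonal is zero."""
    n = len(X)
    if any(X[i][i] != 0 for i in range(n)):
        raise ValueError("all diagonal entries of X must be zero")
    a = [Fraction(i + 1) for i in range(n)]  # n distinct numbers (arbitrary choice)
    A = [[a[i] if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    # B_ij = X_ij / (a_i - a_j) off the diagonal; the diagonal is free, set to 0 here
    B = [[Fraction(X[i][j]) / (a[i] - a[j]) if i != j else Fraction(0)
          for j in range(n)] for i in range(n)]
    return A, B

X = [[0, 2, -1], [3, 0, 5], [7, -4, 0]]
A, B = commutator_factors(X)
print(commutator(A, B) == X)
```

Any other set of distinct a_i and any other diagonal for B would serve equally well, since the diagonal of B cancels in AB - BA.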

To complete the proof of Theorem 1 we make use of the observation that if X can
be represented as a commutator, so can any matrix similar to X. This can be seen
formally by multiplying equation (1) by S on the left and S^{-1} on the right:

S X S^{-1} = S A B S^{-1} - S B A S^{-1}

= (S A S^{-1})(S B S^{-1}) - (S B S^{-1})(S A S^{-1}).

Conceptually, we are using the observation that similar matrices represent the same
mapping but in different coordinate systems.

Lemma 3. Every matrix X whose trace is zero is similar to a matrix all of whose
diagonal entries are zero.

Proof. Suppose not all diagonal entries of X are zero, say x_{11} \ne 0. Then, since
tr X = 0, there must be another diagonal entry, say x_{22}, that is neither zero nor equal
to x_{11}. Therefore the 2 x 2 minor in the upper left corner of X,

Y = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix},

is not a multiple of the identity. Therefore there is a vector h with two components
such that Yh is not a multiple of h. We introduce now h and Yh as new basis in R^2;
with respect to this basis Y is represented by a matrix whose first diagonal element
is zero.

Continuing in this fashion we make changes of variables in two-dimensional
subspaces that introduce a new zero on the diagonal of the matrix representing X,
without destroying any of the zeros that are already there, until there are n - 1
zeros on the diagonal. But since tr X = 0, the remaining diagonal element is
zero too. □
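
The single two-dimensional step in the proof of Lemma 3 can be sketched as follows. The code picks h among three trial vectors (an assumption of this sketch; for a 2 x 2 matrix that is not a multiple of the identity, at least one of (1,0), (0,1), (1,1) is not an eigenvector), forms S with columns h and Yh, and returns S^{-1} Y S, whose first diagonal entry is zero because Yh has coordinates (0, 1) in the new basis.

```python
from fractions import Fraction

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def zero_first_diagonal(Y):
    """For a 2x2 Y that is not a multiple of I, return (S, Z) with Z = S^{-1} Y S, Z[0][0] == 0."""
    Y = [[Fraction(v) for v in row] for row in Y]
    for cand in ((1, 0), (0, 1), (1, 1)):
        h = [Fraction(cand[0]), Fraction(cand[1])]
        Yh = [Y[0][0] * h[0] + Y[0][1] * h[1], Y[1][0] * h[0] + Y[1][1] * h[1]]
        d = h[0] * Yh[1] - h[1] * Yh[0]  # determinant of S = [h | Yh]
        if d != 0:  # h and Yh are linearly independent
            S = [[h[0], Yh[0]], [h[1], Yh[1]]]
            S_inv = [[Yh[1] / d, -Yh[0] / d], [-h[1] / d, h[0] / d]]
            return S, matmul(S_inv, matmul(Y, S))
    raise ValueError("Y is a multiple of the identity")

S, Z = zero_first_diagonal([[1, 2], [3, -1]])
print(Z)
```

Since similarity preserves the trace, the second diagonal entry of Z carries the whole trace of Y; in the n x n argument that entry is then paired with the next diagonal entry.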


Combining Lemma 2 and Lemma 3 gives Theorem 1.
Liapunov's Theorem


In this Appendix we give a far-reaching extension of Theorem 20 in Chapter 10. We
start by replacing Z in that result by its negative, W = -Z.

Theorem 1. Let W be a mapping of a finite-dimensional Euclidean space into
itself whose self-adjoint part is negative:

W + W^* < 0. (1)

Then the eigenvalues of W have negative real part.

This can be proved the same way as Theorem 20 was in Chapter 10. We state now
a generalization of this result.


Theorem 2. Let W be a mapping of a finite-dimensional Euclidean space X into
itself. Let G be a positive self-adjoint map of X into itself that satisfies the inequality

GW + W^*G < 0. (2)


Then the eigenvalues of W have negative real part.
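
A small numerical illustration, under assumptions chosen here and not taken from the text: for the real 2 x 2 matrix W below (so that W^* is the transpose), the self-adjoint part W + W^* is not negative, so Theorem 1 does not apply, yet the positive diagonal G satisfies (2) and the eigenvalues of W do have negative real part, as Theorem 2 asserts.

```python
import cmath

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def is_negative_definite(m):
    """Test for a real symmetric 2x2 matrix: m11 < 0 and det > 0."""
    return m[0][0] < 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0

def eigenvalues_2x2(m):
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

W = [[-1, 10], [0, -1]]   # example matrix (assumption of this sketch)
G = [[1, 0], [0, 100]]    # positive self-adjoint weight (assumption of this sketch)

self_adjoint_part = add(W, transpose(W))
weighted = add(matmul(G, W), matmul(transpose(W), G))

print(is_negative_definite(self_adjoint_part))  # hypothesis (1) of Theorem 1
print(is_negative_definite(weighted))           # hypothesis (2) of Theorem 2
print(eigenvalues_2x2(W))
```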


Proof. Let h be an eigenvector of W, where w is the corresponding eigenvalue:

Wh = wh. (3)

Let the left-hand side of (2) act on h, and take the scalar product with h; according
to (2),

((GW + W^*G)h, h) < 0. (4)
which are of the form e^w, are less than 1 in absolute value. But then the spectral
radius of e^W, the maximum of all |e^w|, is also less than 1:

r(e^W) < 1. (7)

We conclude from (6) applied to A = e^W that

\|e^{Wj}\| \le (r(e^W) + \epsilon)^j, (8)

where \epsilon tends to zero as j \to \infty. It follows from (7) and (8) that \|e^{Wj}\| decays
exponentially as j \to \infty through integer values.

For t not an integer, we decompose t as t = j + f, where j is an integer and f is
between 0 and 1, and we factor e^{Wt} as

e^{Wt} = e^{Wj} e^{Wf}.

So

\|e^{Wt}\| \le \|e^{Wj}\| \, \|e^{Wf}\|. (9)

To estimate \|e^{Wf}\| we replace in (5) W by Wf:

e^{Wf} = \sum_{n=0}^{\infty} \frac{(Wf)^n}{n!},

and apply the additive and multiplicative estimates for the norm of a matrix:

\|e^{Wf}\| \le \sum_{n=0}^{\infty} \frac{\|W^n\| f^n}{n!} \le \sum_{n=0}^{\infty} \frac{\|W\|^n f^n}{n!} = e^{\|W\| f}.

Since f lies between 0 and 1,

\|e^{Wf}\| \le e^{\|W\|}. (10)

We can use (8) and (10) to estimate the right-hand side of (9); using j = t - f, we
get

\|e^{Wt}\| \le (r(e^W) + \epsilon)^{t-f} e^{\|W\|}, (11)

where \epsilon tends to zero as t \to \infty. According to (7), r = r(e^W) < 1; thus it follows from
(11) that \|e^{Wt}\| decays to zero at an exponential rate as t tends to \infty. □
We show now that G, as defined by (12), has the three properties required in
Theorem 3:

(i) G is self-adjoint.

(ii) G is positive.

(iii) GW + W^*G is negative.

To show (i), we note that the adjoint of a mapping defined by an integral of the form
(13) is

\int_a^b A^*(t) \, dt. (14)

It follows that if the integrand A(t) is self-adjoint for each value of t, then so is the
integral (13). Since the integrand in (12) is self-adjoint, so is the integral G, as
asserted in (i).

To show (ii), we make use of the observation that for an integral of form (13) and
for any vector h,

\left( h, \int_a^b A(t) \, dt \, h \right) = \int_a^b (h, A(t)h) \, dt.

It follows that if the integrand A(t) is self-adjoint and positive, so is the integral (13).
Since the integrand in (12) is self-adjoint and positive,

(h, e^{W^*t} e^{Wt} h) = (e^{Wt}h, e^{Wt}h) = \|e^{Wt}h\|^2 > 0,

so is the integral G, as asserted in (ii). To prove (iii), we apply the factors W and W^*
under the integral sign:

GW + W^*G = \int_0^{\infty} \left( e^{W^*t} e^{Wt} W + W^* e^{W^*t} e^{Wt} \right) dt. (15)

Next we observe that the integrand on the right is a derivative:

\frac{d}{dt} \left( e^{W^*t} e^{Wt} \right). (16)

To see this, we use the rule for differentiating the product of e^{W^*t} and e^{Wt}:

\frac{d}{dt} \left( e^{W^*t} e^{Wt} \right) = \left( \frac{d}{dt} e^{W^*t} \right) e^{Wt} + e^{W^*t} \left( \frac{d}{dt} e^{Wt} \right). (17)
We combine this with the rule for differentiating exponential functions (see
Theorem 5 of Chapter 9):

\frac{d}{dt} e^{Wt} = e^{Wt} W, \qquad \frac{d}{dt} e^{W^*t} = e^{W^*t} W^* = W^* e^{W^*t}.

Setting these into (17) shows that (16) is indeed the integrand in (15):

GW + W^*G = \int_0^{\infty} \frac{d}{dt} \left( e^{W^*t} e^{Wt} \right) dt. (15)'

We apply now the fundamental theorem of calculus to evaluate the integral on the
right of (15)'. Since e^{Wt} decays to zero as t tends to \infty and equals I at t = 0, we get

\left. e^{W^*t} e^{Wt} \right|_0^{\infty} = -I.

Thus we have

GW + W^*G = -I,

a negative self-adjoint mapping, as claimed in (iii). This completes the proof of
Theorem 3. □
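
The identity GW + W^*G = -I can be checked on a concrete case. The sketch below uses a real upper-triangular W chosen only for illustration (an assumption, as are the closed form of e^{Wt} for this W and the value of G computed from it by hand); it verifies the identity exactly with rational arithmetic and then compares G with a midpoint-rule approximation of the integral of e^{W^*t} e^{Wt} over a long finite interval.

```python
from fractions import Fraction
from math import exp

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

# Example data (assumptions of this sketch): W has eigenvalues -1 and -2.
W = [[Fraction(-1), Fraction(1)], [Fraction(0), Fraction(-2)]]
G = [[Fraction(1, 2), Fraction(1, 6)], [Fraction(1, 6), Fraction(1, 3)]]

GW = matmul(G, W)
WtG = matmul(transpose(W), G)
lyapunov = [[GW[i][j] + WtG[i][j] for j in range(2)] for i in range(2)]

def exp_Wt(t):
    # closed form of e^{Wt} for this particular W
    return [[exp(-t), exp(-t) - exp(-2 * t)], [0.0, exp(-2 * t)]]

def integral_G(T=40.0, steps=40000):
    """Midpoint rule for the integral of e^{W^T t} e^{W t} over [0, T]."""
    h = T / steps
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for s in range(steps):
        E = exp_Wt((s + 0.5) * h)
        P = matmul(transpose(E), E)
        for i in range(2):
            for j in range(2):
                acc[i][j] += P[i][j] * h
    return acc

print(lyapunov)      # exact value of GW + W^T G
print(integral_G())  # numerical approximation of G
```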

The proof, and therefore the theorem, holds in infinite-dimensional Euclidean
spaces.
The Jordan Canonical Form


In this appendix we present a proof of the converse part of Theorem 12 in Chapter 6:

Theorem 1. Let A and B be a pair of n x n matrices with the following
properties:

(i) A and B have the same eigenvalues c_1, ..., c_k.

(ii) The dimensions of the nullspaces

N_m(c_j) = nullspace of (A - c_j I)^m

and

M_m(c_j) = nullspace of (B - c_j I)^m

are equal for all c_j and all m:

dim N_m(c_j) = dim M_m(c_j). (1)

Then A and B are similar.

Proof. In Theorem 12 of Chapter 6, we have shown that these conditions are
necessary for A and B to be similar. We show now that they are sufficient by
introducing a special basis in which the action of A is particularly simple and
depends only on the eigenvalues c_j and the dimensions (1). We shall deal with each
eigenvalue c_j separately; for simplicity we take c = c_j to be zero. This can be
accomplished by subtracting cI from A; at the end we shall add cI back.


The nullspaces of A^m are nested:

N_1 \subset N_2 \subset \cdots \subset N_d,

where d is the index of the eigenvalue c = 0.

Lemma 2. A maps the quotient space N_{i+1}/N_i into N_i/N_{i-1}, and this mapping
is one-to-one.

Proof. A maps N_{i+1} into N_i; therefore A maps N_{i+1}/N_i into N_i/N_{i-1}. Let {x} be
a nonzero equivalence class in N_{i+1}/N_i; this means that no x belongs to N_i. It
follows that Ax does not belong to N_{i-1}; this shows that A{x} = {Ax} is not the zero
class in N_i/N_{i-1}. This proves Lemma 2. □

It follows from Lemma 2 that

dim(N_{i+1}/N_i) \le dim(N_i/N_{i-1}). (2)

The special basis for A in N_d will be introduced in batches. The first batch,

x_1, ..., x_{l_0}, l_0 = dim(N_d/N_{d-1}), (3)

are any l_0 vectors in N_d that are linearly independent mod N_{d-1}. The next batch
is

A x_1, ..., A x_{l_0}; (4)

these belong to N_{d-1}, and are linearly independent mod N_{d-2}. According to (2), with
i = d - 1, dim(N_{d-1}/N_{d-2}) \ge dim(N_d/N_{d-1}) = l_0. We choose the next batch of
basis vectors in N_{d-1},

x_{l_0+1}, ..., x_{l_1}, (5)

where l_1 = dim(N_{d-1}/N_{d-2}), to complete the vectors (4) to a basis of N_{d-1}/N_{d-2}.
The next batch is

A^2 x_1, ..., A^2 x_{l_0}, A x_{l_0+1}, ..., A x_{l_1}. (6)

The next batch,

x_{l_1+1}, ..., x_{l_2}, (7)
Recall that in order to simplify the presentation of the special basis we have
replaced A by A - cI. Putting back what we have subtracted leads to the following
matrix form of the action of A:

\begin{pmatrix} c & 1 & & 0 \\ & c & \ddots & \\ & & \ddots & 1 \\ 0 & & & c \end{pmatrix},

that is, each entry along the main diagonal is the eigenvalue c, 1's along the
superdiagonal directly above it, and zeros everywhere else. A matrix of this form is
called a Jordan block; when all Jordan blocks are put together, the resulting matrix is
called a Jordan representation of the mapping A.

The Jordan representation of A depends only on the eigenvalues of A and the
dimensions of the generalized eigenspaces N_j(c_k), j = 1, ..., k = 1, .... Therefore
two matrices that have the same eigenvalues and the same-dimensional
eigenspaces and generalized eigenspaces have the same Jordan representation. This
shows that they are similar. □
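
The claim that the Jordan representation depends only on the dimensions of the nullspaces can be made concrete for a single eigenvalue. In the batch construction, dim(N_m/N_{m-1}) counts the chains of length at least m, so the number of Jordan blocks of size exactly m is the drop from dim(N_m/N_{m-1}) to dim(N_{m+1}/N_m). The following sketch (function name and input convention are assumptions of this illustration) recovers the block sizes from the list dim N_1, ..., dim N_d.

```python
def jordan_block_sizes(null_dims):
    """Block sizes for one eigenvalue, largest first.

    null_dims[m - 1] is dim N_m for m = 1, ..., d, where d is the index
    of the eigenvalue (the dimensions stop growing after N_d).
    """
    d = len(null_dims)
    dims = [0] + list(null_dims)  # dims[0] = dim N_0 = 0
    # at_least[m] = dim(N_m / N_{m-1}) = number of blocks of size >= m
    at_least = [0] + [dims[m] - dims[m - 1] for m in range(1, d + 1)] + [0]
    sizes = []
    for m in range(d, 0, -1):
        sizes += [m] * (at_least[m] - at_least[m + 1])
    return sizes

print(jordan_block_sizes([3, 5, 7]))  # dims of N_1, N_2, N_3
print(jordan_block_sizes([2, 3]))
```

Inequality (2) guarantees that the differences used above are nonnegative.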





APPENDIX 16 


Numerical Range 


Let X be a Euclidean space over the complex numbers, and let A be a mapping of X
into X.

Definition. The numerical range of A is the set of complex numbers

(Ax, x), \|x\| = 1.

Note that the eigenvalues of A belong to its numerical range.

Definition. The numerical radius w(A) of A is the supremum of the absolute
values in the numerical range of A:

w(A) = \sup_{\|x\| = 1} |(Ax, x)|. (1)

Since the eigenvalues of A belong to its numerical range, the numerical radius of A
is \ge its spectral radius:

r(A) \le w(A). (2)

Exercise 1. Show that for A normal, equality holds in (2).

Exercise 2. Show that for A normal,

w(A) = \|A\|. (3)


Lemma 1. (i) w(A) \le \|A\|,

(ii) \|A\| \le 2w(A). (4)

Proof. By the Schwarz inequality we have

|(Ax, x)| \le \|Ax\| \, \|x\| \le \|A\| \, \|x\|^2;

since \|x\| = 1, part (i) follows.

(ii) Decompose A into its self-adjoint and anti-self-adjoint parts:

A = S + iT.

Then

(Ax, x) = (Sx, x) + i(Tx, x)

splits (Ax, x) into its real and imaginary parts; therefore

|(Ax, x)| \ge |(Sx, x)|, \qquad |(Ax, x)| \ge |(Tx, x)|.

Taking the supremum over all unit vectors x gives

w(A) \ge w(S), \qquad w(A) \ge w(T). (5)

Since S and T are self-adjoint,

w(S) = \|S\|, \qquad w(T) = \|T\|.

Adding the two inequalities (5), we get

2w(A) \ge \|S\| + \|T\|.

Since \|A\| \le \|S\| + \|T\|, (4) follows. □
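
Part (ii) of Lemma 1 cannot be improved in general. As an illustration chosen here (not taken from the text), the 2 x 2 matrix with a single 1 above the diagonal has norm 1 and numerical radius 1/2. The sketch below estimates w(A) by sampling unit vectors x = (cos \theta, e^{i\phi} sin \theta) in C^2; it assumes the scalar product is linear in its first argument, which does not affect the absolute value.

```python
import cmath
import math

def numerical_radius_sample(A, steps=200):
    """Largest |(Ax, x)| over a grid of unit vectors in C^2 (a lower bound for w(A))."""
    best = 0.0
    for k in range(steps + 1):
        theta = (math.pi / 2) * k / steps
        for m in range(steps):
            phi = 2 * math.pi * m / steps
            x = (complex(math.cos(theta)), cmath.exp(1j * phi) * math.sin(theta))
            Ax = (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])
            value = Ax[0] * x[0].conjugate() + Ax[1] * x[1].conjugate()
            best = max(best, abs(value))
    return best

A = [[0, 1], [0, 0]]  # example matrix (assumption of this sketch); its norm is 1
w = numerical_radius_sample(A)
print(w)
```

For this A one has (Ax, x) = x_2 \bar{x}_1, whose absolute value is at most 1/2 on unit vectors, so \|A\| = 2w(A).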

Paul Halmos conjectured the following:

Theorem 2. For A as above,

w(A^n) \le w(A)^n (6)

for every positive integer n.
The first proof of this result was given by Charles Berger. The remarkably simple
proof presented here is Carl Pearcy's.

Lemma 3. Denote by r_k, k = 1, ..., n, the nth roots of unity: r_k = e^{2\pi i k/n}. For
all complex numbers z,

1 - z^n = \prod_k (1 - r_k z), (7)

and

1 = \frac{1}{n} \sum_j \prod_{k \ne j} (1 - r_k z). (8)

Exercise 3. Verify (7) and (8).
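
Before attempting a proof, identities (7) and (8) can be sanity-checked numerically. The sketch below evaluates both sides in floating-point complex arithmetic for a given n and z and returns the two discrepancies; it is a numerical check only, not a substitute for the verification asked for in Exercise 3, and the function names are illustrative.

```python
import cmath

def roots_of_unity(n):
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(1, n + 1)]

def identity_errors(n, z):
    """Return (error in (7), error in (8)) for the given n and complex z."""
    r = roots_of_unity(n)
    full_product = 1
    for rk in r:
        full_product *= (1 - rk * z)
    total = 0
    for j in range(n):
        partial = 1
        for k in range(n):
            if k != j:
                partial *= (1 - r[k] * z)
        total += partial
    return abs((1 - z ** n) - full_product), abs(1 - total / n)

print(identity_errors(5, 0.3 + 0.7j))
```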
Setting A in place of z, we get

I - A^n = \prod_k (I - r_k A) (9)

and

I = \frac{1}{n} \sum_j \prod_{k \ne j} (I - r_k A). (10)

Let x be any unit vector, \|x\| = 1; denote

\prod_{k \ne j} (I - r_k A) x = x_j. (11)

Letting (9) act on x and using (11), we get

x - A^n x = (I - r_j A) x_j, \qquad j = 1, ..., n. (12)

From (10) acting on x we get

x = \frac{1}{n} \sum_j x_j. (13)

Take the scalar product of (12) with x; since \|x\| = 1, we get on the left

1 - (A^n x, x) = (x - A^n x, x) = \frac{1}{n} \sum_j (x - A^n x, x_j); (14)_1
in the last step we have used (13). Next use (12) on the right:

\frac{1}{n} \sum_j ((I - r_j A) x_j, x_j) = \frac{1}{n} \sum_j \left( \|x_j\|^2 - r_j (A x_j, x_j) \right). (14)_2

By the definition of w(A),

|(A x_j, x_j)| \le w(A) \|x_j\|^2. (15)

Suppose that w(A) \le 1. Then, since |r_j| = 1, it follows from (15) that the real part of
(14)_2 is nonnegative. Since (14)_2 equals (14)_1, it follows that the real part of
1 - (A^n x, x) is nonnegative.

Let \omega be any complex number, |\omega| = 1. From the definition of numerical radius, it
follows that w(\omega A) = w(A); therefore, by the above argument, if w(A) \le 1, we
obtain

1 - Re(\omega^n A^n x, x) = 1 - Re \, \omega^n (A^n x, x) \ge 0.

Since this holds for all \omega, |\omega| = 1, it follows that

|(A^n x, x)| \le 1

for all unit vectors x. It follows that w(A^n) \le 1 if w(A) \le 1.

Since w(A) is a homogeneous function of degree 1,

w(zA) = |z| \, w(A),

and conclusion (6) of Theorem 2 follows. □

Combining Theorem 2 with Lemma 1, we obtain the following:

Corollary 2'. Let A denote an operator as above for which w(A) \le 1. Then for
all n, we have

\|A^n\| \le 2. (16)

This corollary is useful for studying the stability of difference approximations of 
hyperbolic equations. 

Note 1. The proof of Theorem 2 nowhere makes use of the finite dimensionality
of the Euclidean space X.

Note 2. Toeplitz and Hausdorff have proved that the numerical range of every 
mapping A is a convex subset of the complex plane. 




Exercise 4* Determine the numerical range of A = [) and of A 2 